Vol. 37, Parts 3 and 4 December 1950


BIOMETRIKA


FOUNDED BY 


W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON


MANAGING EDITOR 


E. S. PEARSON


ASSOCIATE EDITORS 
M. G. KENDALL JOHN WISHART 


in consultation with 
HARALD CRAMER R. C. GEARY 
J. B. S. HALDANE


ISSUED BY 
THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON 


PRINTED AT THE UNIVERSITY PRESS, CAMBRIDGE 


[Issued 18 December 1950] 




























VOLUME 37, PARTS 3 AND 4 DECEMBER 1950





A SIMPLE STOCHASTIC EPIDEMIC 


By NORMAN T. J. BAILEY 
Department of Medicine, University of Cambridge 


CONTENTS

1. General introduction
2. The importance of stochastic means in epidemics
3. Deterministic treatment of a simple epidemic
4. Stochastic treatment of a simple epidemic
   (a) Solution of stochastic differential-difference equations
   (b) Stochastic mean values
   (c) Completion times
5. Summary


1. GENERAL INTRODUCTION 


The mathematical theory of epidemics has usually been confined to the consideration of 
‘deterministic’ models as, for example, in the work of Kermack & McKendrick (1927 and 
later) and Soper (1929). That is, it has been assumed that, for given numbers of susceptible 
and infectious individuals and given infection and removal rates, a certain definite number of 
fresh cases would arise in a given time. In fact, as is well known, a considerable degree of 
chance enters into the conditions under which fresh infections take place, and it is clear that 
for a more precise analysis we ought to take these statistical fluctuations into account. In 
short, we require ‘stochastic’ models to supplement existing deterministic ones. 

Bartlett (1949) has emphasized this need and has devoted some discussion to various
partial attacks already made on the problem. A brief reference has also been made by 
Bartlett (1946, pp. 52-3) to the simple stochastic epidemic process considered in greater 
detail in the present paper. 

In deterministic treatments the total number of cases is a single-valued function of time, 
but in stochastic treatments the single point on the deterministic curve is replaced by a 
probability distribution. The usual deterministic epidemic curve gives the rate of change with 
respect to time of the total number of cases (regarded as continuous), while the most appro- 
priate stochastic analogue is probably the curve of the rate of change with respect to time 
of the stochastic mean. The latter statement needs to be suitably qualified and is discussed 
in greater detail in the following section. In some processes stochastic means are identical 
with deterministic values, but this is not the case in epidemic processes. It is worth remarking 
in passing that the rather unexpected smoothness of observed epidemic curves is most likely 
to be due to the partial ironing out of statistical variations, by summation over finite periods 
of time (e.g. when quoting so many new cases per day or week) and by summation over 
relatively isolated epidemics occurring simultaneously in subgroups of the main population. 

The present note deals with the very simplest case of the spread of a relatively mild in- 
fection, in which none of the infected individuals is removed from circulation by death, 
recovery or isolation. This is admittedly an over-simplification, but, apart from providing 
a possible basis for more extensive investigations, it may well represent the situation with, 
for example, some of the milder infections of the upper respiratory tract. 



2. THE IMPORTANCE OF STOCHASTIC MEANS IN EPIDEMICS 


It is usual to assume that the probability of a new case occurring in a small interval of time 
is proportional to both the number of susceptible and the number of infectious individuals. 
These assumptions are reasonable for areas small enough for homogeneous mixing to take 
place. This is clearly not so with large areas, for which it is well known that the overall epi- 
demic can often be broken down into smaller epidemics occurring in separate regional sub- 
divisions. These regional epidemics are not necessarily in phase and often interact with each 
other. Taking the process of subdivision a stage further we can consider a single town or 
district. Even here it is obvious that a given infectious individual has not the same chance 
of infecting each inhabitant. He will probably be in close contact with a small number of 
people only, perhaps of the order of 10-50, depending on the nature of his activities. The 
observed epidemic for the whole district will then be built up from epidemics taking place 
in several relatively small groups of associates and acquaintances. Although in practice the 
groups may overlap, we can employ the concept of an effective number of independent groups, 
and it may be possible to assume, as a first approximation, that they are equal in size. 


A typical model would involve a community of, say, $k$ independent groups each of size $n$.


We can imagine an epidemic started by the simultaneous appearance or introduction of 
k primary cases, one for each group. 

Stochastic means are often not very informative because of the large amount of variation 
associated with them. But in the model suggested, at any given time, the coefficient of 
variation of the total number of cases will be $1/\sqrt{k}$ times the coefficient of variation of any
one of the groups. Thus the larger the value of k, the more nearly will the curve of the total 
number of cases approach in shape the curve of the stochastic mean for a population of 
size n; and we may expect the overall epidemic curve to approach in shape the curve of the 
rate of change with respect to time of the stochastic mean. In epidemic processes stochastic 
means are not the same as the deterministic values, so that special treatment is required.

Although the above model is rather over-simplified there is some justification for regarding 
the epidemic curve derived from the stochastic mean as the appropriate stochastic analogue 
corresponding to the classical deterministic epidemic curve. 
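The $1/\sqrt{k}$ scaling of the coefficient of variation can be illustrated by simulation. The following is a minimal sketch, not part of the original paper: it assumes the simple epidemic defined later in the text (a new case occurs at rate $y(n-y+1)$ when $y$ susceptibles remain, with time in units of $1/\beta$), and the function names, the seed, the observation time 0.2 and the choice $k = 16$ are all illustrative assumptions.

```python
import random
import statistics

def susceptibles_at(n, t_end, rng):
    """Simulate one group (n susceptibles plus one initial case) and return
    the number still susceptible at time t_end."""
    y, t = n, 0.0
    while y > 0:
        rate = y * (n - y + 1)        # rate of the next infection
        t += rng.expovariate(rate)    # exponential waiting time
        if t > t_end:
            break
        y -= 1
    return y

def coefficient_of_variation(values):
    return statistics.pstdev(values) / statistics.mean(values)

def cv_of_total_cases(n, k, t_end, reps, rng):
    """CV of the number of new cases summed over k independent groups."""
    totals = [sum(n - susceptibles_at(n, t_end, rng) for _ in range(k))
              for _ in range(reps)]
    return coefficient_of_variation(totals)

rng = random.Random(1950)             # fixed seed so the run is repeatable
cv_single = cv_of_total_cases(n=10, k=1, t_end=0.2, reps=4000, rng=rng)
cv_pooled = cv_of_total_cases(n=10, k=16, t_end=0.2, reps=4000, rng=rng)
ratio = cv_pooled / cv_single         # should lie near 1/sqrt(16) = 0.25
print(round(cv_single, 3), round(cv_pooled, 3), round(ratio, 3))
```

With 16 groups the ratio should fall close to one quarter, up to sampling error.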


3. DETERMINISTIC TREATMENT OF A SIMPLE EPIDEMIC 
Let us consider a community containing n susceptibles into which a single infectious in- 
dividual is introduced. We shall assume that the infection spreads by contact between the 
members of the community, and that it is not sufficiently serious for cases to be withdrawn 
from circulation by isolation or death; also that no case becomes clear of infection during 
the course of the main part of the epidemic. These are wide assumptions but, as already 
suggested, are probably nearly fulfilled with some of the milder infections of the upper 
respiratory tract. 
Let $y$ be the number of susceptibles at time $t$, and let $\beta$ be the infection rate. Then the
number of new infections in time $dt$ is $\beta y(n-y+1)\,dt$. If we replace $t$ by $\beta t$ we shall find that
the equation is dimensionless so far as the infection rate is concerned. It is easy to see that
the deterministic differential equation is

$$\frac{dy}{dt} = -y(n-y+1), \tag{1}$$

satisfying the initial condition

$$y = n \quad \text{when } t = 0. \tag{2}$$







Equation (1) above corresponds to equation (22) given by Bartlett (1946, p. 53). Bartlett’s
main variable is the number of infectious individuals, whereas ours is the number of
susceptibles, and he presumably starts his process with one infectious individual and $n-1$
susceptibles.

The solution of (1) and (2) is

$$y = \frac{n(n+1)}{n + e^{(n+1)t}}. \tag{3}$$


Fig. 1. Comparison of deterministic and stochastic epidemic curves for n = 10 (ordinate z, abscissa βt).
Broken line: deterministic curve; full line: stochastic curve.

Fig. 2. Comparison of deterministic and stochastic epidemic curves for n = 20.
Broken line: deterministic curve; full line: stochastic curve.


Thus the deterministic epidemic curve is

$$z = -\frac{dy}{dt} = \frac{n(n+1)^2 e^{(n+1)t}}{\{n + e^{(n+1)t}\}^2}. \tag{4}$$

The curve reaches a maximum when $t = \log n/(n+1)$, $y = \tfrac{1}{2}(n+1)$ and $z = \tfrac{1}{4}(n+1)^2$. It
is clearly symmetrical about $t = \log n/(n+1)$.
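As a numerical check on (3) and (4), the sketch below integrates equation (1) by a classical fourth-order Runge-Kutta scheme and compares the result with the closed form. The function names and step count are illustrative choices, not taken from the paper.

```python
import math

def y_closed_form(n, t):
    """Equation (3): susceptibles remaining at time t."""
    return n * (n + 1) / (n + math.exp((n + 1) * t))

def epidemic_curve(n, t):
    """Equation (4), evaluated as z = -dy/dt = y (n - y + 1)."""
    y = y_closed_form(n, t)
    return y * (n - y + 1)

def y_runge_kutta(n, t_end, steps=2000):
    """Integrate dy/dt = -y (n - y + 1) from y(0) = n by classical RK4."""
    def f(y):
        return -y * (n - y + 1)
    h, y = t_end / steps, float(n)
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + h * k1 / 2)
        k3 = f(y + h * k2 / 2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

n = 10
t_peak = math.log(n) / (n + 1)        # time of the maximum of the epidemic curve
print(y_runge_kutta(n, 0.3), y_closed_form(n, 0.3))
print(t_peak, y_closed_form(n, t_peak), epidemic_curve(n, t_peak))
```

At the peak the number of susceptibles should equal $\tfrac{1}{2}(n+1)$ and the ordinate $\tfrac{1}{4}(n+1)^2$, and the curve takes equal values at equal distances on either side of the peak.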


The epidemic curve given by (4) is plotted for n = 10 and n = 20 in Figs. 1 and 2 respec- 
tively, where the corresponding curves for the stochastic cases are given for comparison. 

Our solution given in (4) does not agree, making due allowance for the change of notation 
and variable, with Bartlett’s solution (23). However, as the latter does not seem to satisfy 
the apparent initial conditions, there must be a misprint. 


4. STOCHASTIC TREATMENT OF A SIMPLE EPIDEMIC 


(a) Solution of stochastic differential-difference equations 
Let us use the same notation as in §3. Then replacing $t$ by $\beta t$ as before, it is easy to see that
on the assumption of homogeneous mixing the probability of one new infection taking place
in the interval $dt$ is $y(n-y+1)\,dt$. Now suppose that $p_r(t)$ is the probability that there are
$r$ susceptibles still uninfected at time $t$. Then the usual treatment shows that the epidemic
process is characterized by the stochastic differential-difference equations:


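$$\left.\begin{aligned}
\frac{dp_r(t)}{dt} &= (r+1)(n-r)\,p_{r+1}(t) - r(n-r+1)\,p_r(t) \quad (r = 0, 1, 2, \ldots, n-1),\\
\text{and}\quad \frac{dp_n(t)}{dt} &= -n\,p_n(t).
\end{aligned}\right\} \tag{5}$$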
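Equations (5) form a finite linear system and can be integrated numerically. The sketch below is an illustration under stated assumptions rather than the method of the paper, which proceeds analytically: it uses a fixed-step fourth-order Runge-Kutta scheme, and the step count and names are arbitrary choices.

```python
import math

def master_rhs(p, n):
    """Right-hand side of (5); p[r] is the probability of r susceptibles."""
    dp = [0.0] * (n + 1)
    for r in range(n + 1):
        gain = (r + 1) * (n - r) * p[r + 1] if r < n else 0.0
        dp[r] = gain - r * (n - r + 1) * p[r]
    return dp

def solve_master(n, t_end, steps=4000):
    """Integrate (5) from p_n(0) = 1 by classical RK4 with a fixed step."""
    h = t_end / steps
    p = [0.0] * n + [1.0]
    for _ in range(steps):
        k1 = master_rhs(p, n)
        k2 = master_rhs([a + h / 2 * b for a, b in zip(p, k1)], n)
        k3 = master_rhs([a + h / 2 * b for a, b in zip(p, k2)], n)
        k4 = master_rhs([a + h * b for a, b in zip(p, k3)], n)
        p = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(p, k1, k2, k3, k4)]
    return p

n, t = 10, 0.2
p = solve_master(n, t)
mean_susceptibles = sum(r * pr for r, pr in enumerate(p))
deterministic = n * (n + 1) / (n + math.exp((n + 1) * t))   # equation (3)
print(sum(p), p[n], math.exp(-n * t))
print(mean_susceptibles, deterministic)
```

The probabilities should sum to one, $p_n(t)$ should equal $e^{-nt}$, and the mean number of susceptibles should exceed the deterministic value from (3).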


I have to thank Mr D. G. Kendall for drawing my attention to a paper by Feller (1939). 
Feller’s equation (19) is substantially the same as our (5), though it was obtained in a different 
context. Feller gives the solution of sets of equations of this type with generalized coefficients. 
On the right-hand side of our (5) every coefficient appears twice, leading to terms of the type 
$a\,t\,e^{-bt}$ as well as $c\,e^{-bt}$ in the solution. We can use Feller’s solution (21) if we apply the usual
limiting procedure to the terms which have the form 0/0, but it is more convenient for our 
purpose to solve ab initio as follows. 

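Let us use the Laplace transform and its inverse with respect to time given by

$$\left.\begin{aligned}
\phi^*(\lambda) &= \int_0^\infty e^{-\lambda t}\phi(t)\,dt \quad (R(\lambda) > 0),\\
\phi(t) &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} e^{\lambda t}\phi^*(\lambda)\,d\lambda,
\end{aligned}\right\} \tag{6}$$

where $\int_{c-i\infty}^{c+i\infty} = \lim_{w\to\infty}\int_{c-iw}^{c+iw}$, and $c$ is positive and greater than the abscissae of all the residues.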


Taking transforms of the equations in (5) and using the boundary conditions

$$\left.\begin{aligned}
p_r(0) &= 1 \quad (r = n)\\
&= 0 \quad (r < n),
\end{aligned}\right\} \tag{7}$$


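we obtain the recurrence relations

$$\left.\begin{aligned}
q_r &= \frac{(r+1)(n-r)}{\lambda + r(n-r+1)}\,q_{r+1} \quad (r = 0, 1, 2, \ldots, n-1)\\
\text{and}\quad q_n &= \frac{1}{\lambda + n},
\end{aligned}\right\} \tag{8}$$

where $q_r = p_r^* = \int_0^\infty e^{-\lambda t}p_r(t)\,dt$.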
0 


It follows from (8) that $q_r$ is given by

$$q_r = \frac{n!\,(n-r)!}{r!}\prod_{j=1}^{n-r+1}\{\lambda + j(n-j+1)\}^{-1} \quad (0 \le r \le n). \tag{9}$$
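The step from the recurrence (8) to the product form (9) can be checked in exact rational arithmetic. The sketch below takes the product form to be $q_r = \{n!\,(n-r)!/r!\}\prod_{j=1}^{n-r+1}\{\lambda + j(n-j+1)\}^{-1}$; the test value of $\lambda$ and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import factorial

def q_by_recurrence(n, lam):
    """Solve (8) downwards from q_n = 1/(lam + n)."""
    q = [Fraction(0)] * (n + 1)
    q[n] = 1 / (lam + n)
    for r in range(n - 1, -1, -1):
        q[r] = Fraction((r + 1) * (n - r)) / (lam + r * (n - r + 1)) * q[r + 1]
    return q

def q_closed_form(n, r, lam):
    """Product form (9) for the transform of p_r(t)."""
    value = Fraction(factorial(n) * factorial(n - r), factorial(r))
    for j in range(1, n - r + 2):
        value /= lam + j * (n - j + 1)
    return value

n, lam = 6, Fraction(3, 7)            # any positive rational lam will do
q = q_by_recurrence(n, lam)
agree = all(q[r] == q_closed_form(n, r, lam) for r in range(n + 1))
print(agree, sum(q) == 1 / lam)
```

Since the $p_r(t)$ sum to one at every $t$, their transforms must sum to $1/\lambda$, which gives a second exact check.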







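It is important to note that if $r > \tfrac{1}{2}(n+1)$, then the factors in the denominator are all
different, while if $r < \tfrac{1}{2}(n+1)$ some of them are repeated. In the latter case we can write the
denominator as

$$(\lambda+n)\{\lambda+2(n-1)\}\{\lambda+3(n-2)\}\ldots\{\lambda+r(n-r+1)\}^2\{\lambda+(r+1)(n-r)\}^2\ldots
\times\{\lambda+(\tfrac{1}{2}n-1)(\tfrac{1}{2}n+2)\}^2\{\lambda+\tfrac{1}{2}n(\tfrac{1}{2}n+1)\}^2, \quad \text{for } n \text{ even}, \tag{10}$$

and

$$(\lambda+n)\{\lambda+2(n-1)\}\{\lambda+3(n-2)\}\ldots\{\lambda+r(n-r+1)\}^2\{\lambda+(r+1)(n-r)\}^2\ldots
\times\{\lambda+\tfrac{1}{4}(n-1)(n+3)\}^2\{\lambda+\tfrac{1}{4}(n+1)^2\}, \quad \text{for } n \text{ odd}. \tag{11}$$

Thus all terms after the $(r-1)$th are squared, unless $n$ is odd, in which case the last term is
not squared.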

We can now see the general character of the solution. To find $p_r$ we merely express $q_r$ in
partial fractions and use the inverse of the Laplace transform. Terms in the denominator
like $\{\lambda+r(n-r+1)\}$ and $\{\lambda+r(n-r+1)\}^2$ will give rise to $\exp\{-r(n-r+1)t\}$ and
$t\exp\{-r(n-r+1)t\}$ respectively. The coefficients of the latter terms are simply the
coefficients of the corresponding terms in the expansion of $q_r$ in partial fractions.

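In particular, we have

$$q_0 = (n!)^2/[\lambda(\lambda+n)^2\{\lambda+2(n-1)\}^2\ldots] \tag{12}$$

$$\phantom{q_0} = \frac{1}{\lambda} + \sum_r\left[\frac{h_r}{\lambda+r(n-r+1)} + \frac{k_r}{\{\lambda+r(n-r+1)\}^2}\right], \tag{13}$$

where

$$k_r = \lim_{\lambda\to -r(n-r+1)}\{\lambda+r(n-r+1)\}^2 q_0 = -\frac{(n!)^2(n-2r+1)^2}{r!\,(r-1)!\,(n-r)!\,(n-r+1)!}. \tag{14}$$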
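Now if the probability generating function is

$$\Pi(x, t) = \sum_{r=0}^{n} x^r p_r(t), \tag{15}$$

then it can be seen from (5) that $\Pi(x, t)$ satisfies the partial differential equation

$$\frac{\partial\Pi}{\partial t} = (1-x)\left\{n\frac{\partial\Pi}{\partial x} - x\frac{\partial^2\Pi}{\partial x^2}\right\}, \tag{16}$$

with the boundary condition

$$\Pi(x, 0) = x^n. \tag{17}$$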

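The equations for the moment-generating function, $M(\theta, t)$, are derived from (16) and (17)
by writing $x = e^\theta$. We find

$$\frac{\partial M}{\partial t} = (e^{-\theta}-1)\left\{(n+1)\frac{\partial M}{\partial\theta} - \frac{\partial^2 M}{\partial\theta^2}\right\}, \tag{18}$$

with the boundary condition

$$M(\theta, 0) = e^{n\theta}. \tag{19}$$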


Our (18) is substantially the same as equation (20) given by Bartlett (1946, p. 53), making 
proper allowance for the change of notation and variable. 
Suppose that the $r$th moment of the distribution of $y$ is $m_r'$, then we can substitute

$$M(\theta) = 1 + m_1'\theta + m_2'\frac{\theta^2}{2!} + \ldots \tag{20}$$











in (18), and equate coefficients of $\theta$ to give the following set of differential equations:

$$\left.\begin{aligned}
\frac{dm_1'}{dt} &= -\{(n+1)m_1' - m_2'\},\\
\frac{dm_2'}{dt} &= +\{(n+1)m_1' - m_2'\} - 2\{(n+1)m_2' - m_3'\},\\
\frac{dm_3'}{dt} &= -\{(n+1)m_1' - m_2'\} + 3\{(n+1)m_2' - m_3'\} - 3\{(n+1)m_3' - m_4'\},\\
&\text{etc.,}
\end{aligned}\right\} \tag{21}$$



where the numerical coefficients on the right-hand side can be derived from binomial ex- 
pansions. 


Unfortunately, these equations, while capable of giving the higher moments in terms of
$m_1'$ when the latter has been found, e.g.

$$m_2' = (n+1)m_1' + \frac{dm_1'}{dt}, \tag{22}$$

are not so convenient for finding $m_1'$ itself. They can obviously be used to develop a Taylor
expansion for $m_1'$ in powers of $t$ by successive differentiation and substitution, since all the
moments are known when $t = 0$. We have, in fact,

$$m_1' = n - nt - \frac{n(n-2)}{2!}t^2 - \frac{n(n^2-8n+8)}{3!}t^3 \ldots. \tag{23}$$

However, the series does not converge rapidly enough to be very useful.

Thus we see that the usual method of equating coefficients of $\theta$ in the partial differential
equation for the moment-generating function, which often leads to simple differential
equations for at least the early moments, fails to be of service in the case of stochastic
epidemic processes. In the next subsection we shall consider a different method of approach.
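The successive differentiation and substitution that produces the Taylor series (23) can be mechanised. The sketch below assumes the general member of the family (21) to be $dm_j'/dt = \sum_{i=1}^{j}\binom{j}{i}(-1)^i\{(n+1)m_{j-i+1}' - m_{j-i+2}'\}$, which reproduces the three equations printed in (21), and uses $m_j'(0) = n^j$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def differentiate(combo, n):
    """Time-derivative of a linear combination {j: c} of raw moments m_j,
    using dm_j/dt = sum_i C(j,i) (-1)^i {(n+1) m_{j-i+1} - m_{j-i+2}}."""
    out = {}
    for j, c in combo.items():
        for i in range(1, j + 1):
            w = c * comb(j, i) * (-1) ** i
            out[j - i + 1] = out.get(j - i + 1, 0) + w * (n + 1)
            out[j - i + 2] = out.get(j - i + 2, 0) - w
    return out

def taylor_coefficients(n, order):
    """Coefficients of t^0 .. t^order in the expansion of the mean m_1(t)."""
    combo, coeffs = {1: 1}, []
    for k in range(order + 1):
        value = sum(c * n ** j for j, c in combo.items())   # m_j(0) = n^j
        coeffs.append(Fraction(value, factorial(k)))
        combo = differentiate(combo, n)
    return coeffs

coeffs = taylor_coefficients(10, 3)
print(coeffs)   # constant, t, t^2 and t^3 coefficients for n = 10
```

For each $n$ the coefficients of $t^2$ and $t^3$ should equal $-n(n-2)/2!$ and $-n(n^2-8n+8)/3!$, as in (23).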


(b) Stochastic mean values 
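Let the transform of the probability-generating function be

$$\Pi^*(x, \lambda) = \sum_{r=0}^{n} x^r q_r. \tag{24}$$

Referring to equations (9) to (13), it is clear that we can write

$$\Pi^*(x, \lambda) = \frac{1}{\lambda} + \sum_{r=1}\left[\frac{f_r(x)}{\{\lambda+r(n-r+1)\}^2} + \frac{g_r(x)}{\{\lambda+r(n-r+1)\}}\right], \tag{25}$$

where $f_r(x)$ and $g_r(x)$ are polynomials in $x$. Thus the probability-generating function itself
is of the form

$$\Pi(x, t) = 1 + \sum_{r=1}\{t f_r(x) + g_r(x)\}\,e^{-r(n-r+1)t}. \tag{26}$$

Therefore

$$m_1'(t) = \left[\frac{\partial\Pi}{\partial x}\right]_{x=1} = \sum_{r=1}\{t f_r'(1) + g_r'(1)\}\,e^{-r(n-r+1)t}, \tag{27}$$

where, in the expression on the right-hand side, primes are used to indicate differentiation
with respect to $x$.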


Now the transform of (16) shows that $\Pi^*(x, \lambda)$ satisfies the differential equation

$$x(1-x)\frac{\partial^2\Pi^*}{\partial x^2} - n(1-x)\frac{\partial\Pi^*}{\partial x} + \lambda\Pi^* = x^n. \tag{28}$$




If we substitute (25) in (28), multiply by $\{\lambda+r(n-r+1)\}^2$ and then put $\lambda = -r(n-r+1)$,
we obtain

$$x(1-x)f_r'' - n(1-x)f_r' - r(n-r+1)f_r = 0. \tag{29}$$

The solution of (29) is

$$f_r(x) = C\,F\{-r, -n+r-1; -n; x\}, \tag{30}$$

where $F$ is a terminating hypergeometric series and $C$ an arbitrary constant. However,
$C$ is evidently the coefficient of $\{\lambda+r(n-r+1)\}^{-2}$ in the partial fraction expansion of $q_0$,
i.e. $C = k_r$, whose value is given in (14). Substituting this value in (30), differentiating with
respect to $x$ and then putting $x = 1$ gives

$$f_r'(1) = k_r\left[\frac{dF}{dx}\right]_{x=1} = -k_r\,\frac{r(n-r+1)}{n}\,F\{-r+1, -n+r; -n+1; 1\}.$$

Therefore

$$f_r'(1) = \frac{n!\,(n-2r+1)^2}{(r-1)!\,(n-r)!}. \tag{31}$$
Specimen values of these coefficients, occurring in the expression for $m_1'(t)$ given by (27), are

$$\begin{array}{cl}
r & f_r'(1)\\
1 & n(n-1)^2\\
2 & n(n-1)(n-3)^2/1!\\
3 & n(n-1)(n-2)(n-5)^2/2!\\
4 & n(n-1)(n-2)(n-3)(n-7)^2/3!\\
\text{etc.} &
\end{array} \tag{32}$$
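The specimen values in (32) follow the pattern $f_r'(1) = n!\,(n-2r+1)^2/\{(r-1)!\,(n-r)!\}$. As an independent check, the coefficient of $t\,e^{-r(n-r+1)t}$ in the mean can be obtained directly from the transform $\sum_s s\,q_s$, using the product form $q_s = \{n!\,(n-s)!/s!\}\prod_{j=1}^{n-s+1}\{\lambda+j(n-j+1)\}^{-1}$. The sketch below does this in exact arithmetic for $r < \tfrac{1}{2}(n+1)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def f_prime_at_one(n, r):
    """Closed form matching the specimen values in (32)."""
    return Fraction(factorial(n) * (n - 2 * r + 1) ** 2,
                    factorial(r - 1) * factorial(n - r))

def double_pole_coefficient(n, r):
    """Coefficient of {lam + r(n-r+1)}^-2 in sum_s s*q_s, for r < (n+1)/2."""
    def a(j):
        return j * (n - j + 1)
    total = Fraction(0)
    for s in range(1, r + 1):          # only q_s with s <= r has the squared factor
        term = Fraction(factorial(n) * factorial(n - s), factorial(s))
        for j in range(1, n - s + 2):
            if j not in (r, n + 1 - r):
                term /= a(j) - a(r)    # remaining factors evaluated at lam = -a(r)
        total += s * term
    return total

n = 10
coefficients = [f_prime_at_one(n, r) for r in range(1, n // 2 + 1)]
print(coefficients)
print([double_pole_coefficient(n, r) for r in range(1, n // 2 + 1)])
```

For $n = 10$ both lists should agree and give the coefficients of $t$ in the explicit formula for the mean.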





To find the polynomials $g_r(x)$ we substitute (25) in (28), multiply by $\{\lambda+r(n-r+1)\}^2$,
differentiate with respect to $\lambda$ and then put $\lambda = -r(n-r+1)$. This gives the following
equation:

$$x(1-x)g_r'' - n(1-x)g_r' - r(n-r+1)g_r = -f_r. \tag{33}$$

From (33) we can derive a series solution for $g_r(x)$ in terms of the known $f_r(x)$. I have
unfortunately been unable to find a simple and convenient general expression for $g_r(x)$, although
it is easy to show that

$$g_1'(1) = n - n(n-1)\left(1 + \tfrac{1}{2} + \tfrac{1}{3} + \ldots + \frac{1}{n-2}\right). \tag{34}$$


In view of the importance of the stochastic mean it was thought worth while to examine 
one or two cases in detail. We can calculate the probability-generating function as indicated 
above and from it derive the formula for the mean. The chief labour is in calculating the 
coefficients in the partial fraction expansions of expressions like (9), but it can, however, be 
materially reduced by suitably schematizing the computations. 

The mean value has been found for the special cases, n = 10 and n = 20. The formula 
for n = 10 is given below explicitly, and in both cases the epidemic curves are plotted in 
Figs. 1 and 2, where they are compared with the corresponding deterministic curves. 

The expressions for the stochastic mean and the epidemic curve in the special case, n = 10, 
are 

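$$\begin{aligned}
m_1' ={}& e^{-10t}(810t - 234\tfrac{17}{28}) + e^{-18t}(4410t - 902\tfrac{1}{4}) + e^{-24t}(9000t - 1247\tfrac{1}{7})\\
&+ e^{-28t}(7560t + 126) + e^{-30t}(1260t + 2268),
\end{aligned} \tag{35}$$

$$\begin{aligned}
z = -\frac{dm_1'}{dt} ={}& e^{-10t}(8100t - 3156\tfrac{1}{14}) + e^{-18t}(79{,}380t - 20{,}650\tfrac{1}{2}) + e^{-24t}(216{,}000t - 38{,}931\tfrac{3}{7})\\
&+ e^{-28t}(211{,}680t - 4032) + e^{-30t}(37{,}800t + 66{,}780).
\end{aligned} \tag{36}$$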
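The coefficients in (35) and (36) can be cross-checked in exact arithmetic. The sketch below takes the fractional constants in (35) to be $234\tfrac{17}{28}$, $902\tfrac{1}{4}$ and $1247\tfrac{1}{7}$; this reading of the printed formula is an assumption, chosen because it satisfies $m_1'(0) = 10$ and reproduces the constants of (36) on differentiation. The variable and function names are hypothetical.

```python
from fractions import Fraction as F

# (rate a, coefficient of t, constant) for each term (f t + g) e^{-a t} of (35), n = 10.
MEAN_TERMS = [
    (10, F(810), -(234 + F(17, 28))),
    (18, F(4410), -(902 + F(1, 4))),
    (24, F(9000), -(1247 + F(1, 7))),
    (28, F(7560), F(126)),
    (30, F(1260), F(2268)),
]

def minus_derivative(terms):
    """-d/dt of sum (f t + g) e^{-a t} equals sum (a f t + a g - f) e^{-a t}."""
    return [(a, a * f, a * g - f) for a, f, g in terms]

CURVE_TERMS = minus_derivative(MEAN_TERMS)          # the terms of (36)
mean_at_zero = sum(g for _, _, g in MEAN_TERMS)     # should equal n = 10
curve_at_zero = sum(g for _, _, g in CURVE_TERMS)   # should equal y(n-y+1) at y = n
print(mean_at_zero, curve_at_zero)
for term in CURVE_TERMS:
    print(term)
```

Both the mean and the epidemic curve start at 10 when $t = 0$, the latter because initially $y(n-y+1) = n$ with certainty.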













It is evident that, for both n = 10 and n = 20, the stochastic epidemic curve has a somewhat 
different character from the deterministic curve. The latter is symmetrical about its maximum
ordinate, whereas the former is not and falls more slowly than it originally rose. On the other 
hand, it may be noticed that the time at which the maximum occurs in the stochastic case 
does not differ very much from the time of the maximum in the deterministic case. 

It is perhaps worth mentioning here that Feller’s remark (1939, p. 22) about the stochastic 
mean always being less than the deterministic value is easily seen, from a comparison of our 
equations (1) and (22), to hold in the present case, provided we apply it to the mean number 
of infectious individuals (not the mean number of susceptibles)—as we should if the correct 
analogy is to be made with the process considered by Feller. In order to prevent confusion 
it should be remembered that Figs. 1 and 2 show the epidemic curve, i.e. the rate of change 
with respect to time of the mean number of infectious individuals. 


(c) Completion times 


Let us call an epidemic complete when all the available susceptibles have been exhausted; 
otherwise we shall say it is incomplete. It can be seen from (26) that $\Pi(x, \infty) = 1$; that is,
the simple epidemic under consideration is always completed eventually. For more com- 
plicated types of epidemic this is not necessarily so, for the infected individuals may all 
be removed before the epidemic is complete. 

Now $p_0(\tau)$ is the probability that the epidemic is complete at time $\tau$. But since the number
of susceptibles is a non-increasing function, $p_0(\tau)$ is also the chance that the epidemic has
been completed in the interval from 0 to $\tau$. Thus $p_0(\tau)$ is the distribution function and $dp_0/d\tau$
the frequency function for the completion time $\tau$.

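The moment-generating function for the completion time is

$$\begin{aligned}
M_\tau(\theta) &= E(e^{\theta\tau})\\
&= \int_0^\infty \frac{dp_0}{d\tau}\,e^{\theta\tau}\,d\tau\\
&= \left[p_0 e^{\theta\tau}\right]_0^\infty - \theta\int_0^\infty p_0\,e^{\theta\tau}\,d\tau, \quad \text{integrating by parts},\\
&= -\theta q_0(-\theta), \quad \text{for } \theta < 0,
\end{aligned}$$

since $p_0(0) = 0$, $p_0(\infty) = 1$.

Therefore

$$M_\tau(\theta) = -\theta q_0(-\theta). \tag{37}$$

Substituting for $q_0$ from (12) we obtain

$$M_\tau(\theta) = (n!)^2\Big/\prod_{j=1}^{n}\{-\theta + j(n-j+1)\} = \prod_{j=1}^{n}\left\{1 - \frac{\theta}{j(n-j+1)}\right\}^{-1}. \tag{38}$$

The cumulant-generating function is then given by

$$K_\tau(\theta) = -\sum_{j=1}^{n}\log\left\{1 - \frac{\theta}{j(n-j+1)}\right\}, \tag{39}$$

and the $r$th cumulant is evidently

$$\kappa_r = (r-1)!\sum_{j=1}^{n}\frac{1}{j^r(n-j+1)^r}. \tag{40}$$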
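For small $n$ the cumulants can be evaluated directly. The sketch below assumes the formula $\kappa_r = (r-1)!\sum_{j=1}^{n} j^{-r}(n-j+1)^{-r}$ for (40) and derives the mean, standard deviation, skewness, kurtosis and coefficient of variation of the completion time; the function names and the dictionary layout are illustrative choices.

```python
from math import factorial, sqrt

def cumulant(n, r):
    """r-th cumulant of the completion time, equation (40)."""
    return factorial(r - 1) * sum(1.0 / (j * (n - j + 1)) ** r
                                  for j in range(1, n + 1))

def summary(n):
    """Mean, s.d., skewness, kurtosis and c. of v. (%) of the completion time."""
    k1, k2, k3, k4 = (cumulant(n, r) for r in range(1, 5))
    sd = sqrt(k2)
    return {"mean": k1, "sd": sd, "skewness": k3 / k2 ** 1.5,
            "kurtosis": k4 / k2 ** 2, "cv_percent": 100 * sd / k1}

row_10 = summary(10)
for name, value in row_10.items():
    print(f"{name:>10s} {value:8.3f}")
```

The values for $n = 10$ can be compared with the first row of the table of completion-time characteristics given later in the paper.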







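Each term on the right-hand side of (40) can be expanded in a series of partial fractions.
If we collect together quantities with the same index we can write

$$\left.\begin{aligned}
\kappa_r &= 2(r-1)!\sum_{p=1}^{r} a_p S_p,\\
\text{where}\quad a_p &= r(r+1)\ldots(2r-p-1)/\{(r-p)!\,(n+1)^{2r-p}\} \quad (p < r),\\
a_r &= 1/(n+1)^r,\\
\text{and}\quad S_p &= \sum_{j=1}^{n}\frac{1}{j^p}.
\end{aligned}\right\} \tag{41}$$

Thus the first four cumulants are

$$\left.\begin{aligned}
\kappa_1 &= \frac{2}{n+1}S_1,\\
\kappa_2 &= \frac{4}{(n+1)^3}S_1 + \frac{2}{(n+1)^2}S_2,\\
\kappa_3 &= \frac{24}{(n+1)^5}S_1 + \frac{12}{(n+1)^4}S_2 + \frac{4}{(n+1)^3}S_3,\\
\kappa_4 &= \frac{240}{(n+1)^7}S_1 + \frac{120}{(n+1)^6}S_2 + \frac{48}{(n+1)^5}S_3 + \frac{12}{(n+1)^4}S_4.
\end{aligned}\right\} \tag{42}$$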
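The regrouping of (40) into power sums can be verified exactly. The sketch below assumes that the coefficient of $S_p$ is $a_p = \binom{2r-p-1}{r-p}/(n+1)^{2r-p}$, which is the product $r(r+1)\ldots(2r-p-1)/(r-p)!$ divided by $(n+1)^{2r-p}$ and which yields the numerical coefficients 2; 4, 2; 24, 12, 4; 240, 120, 48, 12 of (42). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def power_sum(n, p):
    """S_p(n) = sum of 1/j^p for j = 1..n."""
    return sum(Fraction(1, j ** p) for j in range(1, n + 1))

def cumulant_direct(n, r):
    """Equation (40), summed term by term."""
    return factorial(r - 1) * sum(Fraction(1, (j * (n - j + 1)) ** r)
                                  for j in range(1, n + 1))

def cumulant_from_power_sums(n, r):
    """Equation (41): 2 (r-1)! sum_p a_p S_p."""
    total = sum(Fraction(comb(2 * r - p - 1, r - p), (n + 1) ** (2 * r - p))
                * power_sum(n, p) for p in range(1, r + 1))
    return 2 * factorial(r - 1) * total

checks = [cumulant_direct(n, r) == cumulant_from_power_sums(n, r)
          for n in (3, 10, 20) for r in (1, 2, 3, 4)]
print(all(checks))
```

Exact rational arithmetic is used so that agreement is an identity rather than a rounding coincidence.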


I am indebted to Dr J. Wishart for pointing out to me that $S_p(n)$ is in general most easily
computed by writing

$$\left.\begin{aligned}
S_1(n) &= \psi(n+1) - \psi(1),\\
S_p(n) &= \frac{(-1)^{p-1}}{(p-1)!}\{\psi^{(p-1)}(n+1) - \psi^{(p-1)}(1)\} \quad (p > 1),
\end{aligned}\right\} \tag{43}$$

since values of the Polygamma Functions $\psi(x)$, $\psi'(x)$, $\psi''(x)$, ... are readily available from
Tables of the Higher Mathematical Functions by Davis (1933, 1935).
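For small $n$ the cumulants are most easily calculated directly from (40) and for large $n$
we can obtain asymptotic formulae by using the well-known expansions

$$\left.\begin{aligned}
S_1(n) &= \gamma + \log n + \frac{1}{2n} - \frac{B_1}{2n^2} + \frac{B_2}{4n^4} - \ldots,\\
S_p(n) &= \zeta(p) - \frac{1}{(p-1)n^{p-1}} + \frac{1}{2n^p} - \frac{B_1}{2}\binom{p}{1}\frac{1}{n^{p+1}} + \ldots \quad (p > 1),
\end{aligned}\right\} \tag{44}$$

where $\zeta(p)$ is Riemann’s $\zeta$-function, $\gamma$ is Euler’s constant and the $B$’s are Bernoulli’s numbers
(in the notation $B_1 = \tfrac{1}{6}$, $B_2 = \tfrac{1}{30}$, ...).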


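It is evident from (41), (42) and (44) that for large $n$

$$\left.\begin{aligned}
\kappa_1 &\sim \frac{2}{n+1}(\log n + \gamma),\\
\kappa_r &\sim \frac{2(r-1)!}{(n+1)^r}\,\zeta(r) \quad (r > 1).
\end{aligned}\right\} \tag{45}$$

Thus the coefficient of variation is asymptotically equal to

$$\pi/(2\sqrt{3}\,\log n), \tag{46}$$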











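and the limiting values of $\gamma_1$ and $\gamma_2$, the coefficients of skewness and kurtosis, are given by

$$\left.\begin{aligned}
\lim_{n\to\infty}\gamma_1 &= \frac{\sqrt{2}\,\zeta(3)}{\{\zeta(2)\}^{3/2}},\\
\lim_{n\to\infty}\gamma_2 &= \frac{3\,\zeta(4)}{\{\zeta(2)\}^2}.
\end{aligned}\right\} \tag{47}$$

Values of $\bar{\tau}$, $\sigma_\tau$, $\gamma_1$, $\gamma_2$ and the coefficient of variation are given in the following table for
$n$ = 10, 20, 40, 80 and $\infty$.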


Some characteristics of the distribution of completion time

   n     mean τ     σ_τ       γ_1      γ_2     c. of v. (%)
   10    0.533      0.186     0.831    1.169   34.8
   20    0.343      0.0938    0.774    1.081   27.4
   40    0.209      0.0467    0.764    1.086   22.4
   80    0.123      0.0231    0.771    1.114   18.9
   ∞     0          0         0.806    1.200   0


























Thus it is evident that appreciable skewness and kurtosis remain even with large n. 
Furthermore, the coefficient of variation shows that for idealized communities in which the 
group size is 80 or less there will be considerable differences between groups with respect to 
the time taken for all the susceptibles of a group to become infected. 
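The limiting skewness and kurtosis quoted in the last row of the table can be evaluated numerically. The sketch below assumes the limits $\sqrt{2}\,\zeta(3)/\{\zeta(2)\}^{3/2}$ and $3\zeta(4)/\{\zeta(2)\}^2$ and the asymptotic coefficient of variation $\pi/(2\sqrt{3}\log n)$; the helper `zeta` is a hypothetical routine that sums the series directly and adds an Euler-Maclaurin estimate of the tail.

```python
import math

def zeta(p, terms=100_000):
    """Riemann zeta(p) for p > 1: direct sum plus an estimate of the tail."""
    partial = sum(1.0 / k ** p for k in range(1, terms + 1))
    tail = 1.0 / ((p - 1) * terms ** (p - 1)) - 0.5 / terms ** p
    return partial + tail

limiting_skewness = math.sqrt(2) * zeta(3) / zeta(2) ** 1.5
limiting_kurtosis = 3 * zeta(4) / zeta(2) ** 2

def cv_asymptotic(n):
    """Leading-order coefficient of variation of the completion time."""
    return math.pi / (2 * math.sqrt(3) * math.log(n))

print(round(limiting_skewness, 3), round(limiting_kurtosis, 3))
print(round(100 * cv_asymptotic(80), 1))
```

Because the leading-order coefficient of variation falls off only as $1/\log n$, it is a rough guide at moderate $n$ and can be compared with the tabulated value for $n = 80$.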


5. SUMMARY 


Classical mathematical investigations into the theory of epidemics have usually been 
deterministic, i.e. they have not taken probabilities into account. The present note attempts 
to make good this deficiency for a simple epidemic, where we have the spread of a relatively 
mild infection, in which none of the infected individuals is removed from circulation by death, 
recovery or isolation. 

It is suggested that in general epidemic curves derived from stochastic means for the appro- 
priate mathematical model would be likely to bear a close resemblance to the published 
returns for actual epidemics, because it is considered that the latter are in fact summed over 
a number of epidemics occurring in small groups of associates and acquaintances. 

Curves of the stochastic means for the simple epidemic under consideration are given in 
the special cases when the group sizes are 10 and 20. The characteristics of the distribution 
of the time taken for the number of susceptibles to become exhausted are also discussed. 


REFERENCES 

Bartlett, M. S. (1946). Stochastic Processes (notes of a course given at the University of North Carolina
in the Fall Quarter, 1946).

Bartlett, M. S. (1949). Some evolutionary stochastic processes. J. R. Statist. Soc., Ser. B, 11, 2.

Davis, H. T. (1933, 1935). Tables of the Higher Mathematical Functions, 1, 2. Indiana: Principia Press.

Feller, W. (1939). Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in
wahrscheinlichkeitstheoretischer Behandlung. Acta Biotheoretica, 5, 11.

Kermack, W. O. & McKendrick, A. G. (1927 and later). Contributions to the mathematical theory of
epidemics. Proc. Roy. Soc. A, 115, 700; 138, 55; 141, 94.

Soper, H. E. (1929). Interpretation of periodicity in disease-prevalence. J. R. Statist. Soc. 92, 34.













ON THE FISHER-BEHRENS TEST
By G. A. BARNARD 


The Fisher-Behrens test was proposed to solve the following problem: We are given two
samples, of $n_1$ and $n_2$ members, respectively, from two normal populations, means $\mu_1$, $\mu_2$ and
variances $\sigma_1^2$, $\sigma_2^2$, respectively. We wish to test the hypothesis that $\mu_1 = \mu_2 = \mu$, say, without
assuming that $\sigma_1 = \sigma_2$. Fisher’s test is based on $d$, the standardized difference of the sample
means $\bar{x}_1$, $\bar{x}_2$:

$$d = (\bar{x}_1 - \bar{x}_2)/\sqrt{\{(s_1^2/n_1) + (s_2^2/n_2)\}},$$

where $s_1^2$, $s_2^2$ are the variance estimates obtained from the two samples. The significance levels
for $d$ are found from tables calculated by Sukhatme, which are entered at a point given by
a ‘scale factor’ calculated from

$$\tan\theta = K_1/K_2,$$

where $K_i = \sqrt{(n_i)}/s_i$ ($i = 1, 2$). It has been objected to Fisher’s test that the probabilities
given by Sukhatme’s tables are not in all cases equal to the probabilities that $d$ should exceed
given values, calculated on the assumption that $\mu_1 = \mu_2$ and that $n_1$ and $n_2$ are fixed in advance.
Fisher has replied to this objection by asserting that no such equality of the probabilities
is required. Specifically, he has asserted that the requirement that the probabilities involved
in the test should be calculated on the basis of a reference set with fixed $n_1$ and $n_2$ is foreign
to the true logic of the test procedure.

The primary object of this note is to point out that it is possible to define a reference set 
within which the probabilities given by Sukhatme’s tables do, to a degree of approximation 
universally accepted, represent the true probabilities that $d$ will exceed given values,
calculated on the hypothesis that $\mu_1 = \mu_2$. The observation that this was possible arose out of
work done by Mr H. Ruben, one of my research students, in which he was independently 
covering and extending some work done and published by C. Stein. To make this note self- 
contained, we shall briefly recapitulate, with inessential modifications, Stein’s and Ruben’s 
procedure. 

Ruben and Stein begin together by considering the problem of finding confidence intervals
whose length has a prescribed maximum $l$, with prescribed confidence coefficient $1-\alpha$, for the
mean $\mu$ of a normal population of unknown variance $\sigma^2$. To do this, they tell us to choose
arbitrarily an integer $n'$, not less than 2. We then find from the tables of the distribution of
‘Student’s’ $t$, the value of $t$ corresponding to the significance level $1-\alpha$ (two-sided test) on
$n'-1$ degrees of freedom. Call this value $t_\alpha$. We then calculate

$$k = 2t_\alpha/l.$$

We now take a first sample of $n'$ observations, and from it calculate the estimate of the
variance $s^2$. We then take further observations, to a total number $n$, such that

$$n - 1 < k^2s^2 \le n.$$

If it turns out that $n'$ is already too large to satisfy this inequality, we can take the first $n$ of
the $n'$ observations already made. We now calculate the sample mean $\bar{x}$ of the $n$ observations.
We shall then have $\sqrt{n}\,(\bar{x}-\mu)/\sigma$ normally distributed with mean 0 and unit variance, while
$(n'-1)s^2/\sigma^2$ is independently distributed in the $\chi^2$ distribution with $n'-1$ degrees of freedom.
Consequently the ratio

$$\sqrt{n}\,(\bar{x}-\mu)/s$$

has the $t$ distribution with $n'-1$ degrees of freedom. Hence we can make the assertion

$$\bar{x} - t_\alpha s/\sqrt{n} < \mu < \bar{x} + t_\alpha s/\sqrt{n}$$

with confidence coefficient $1-\alpha$. The length of the confidence interval is $2t_\alpha s/\sqrt{n}$, and from the
inequality above it is clear that this cannot exceed $l$.
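The double-sampling rule just described can be tried out by simulation. The following is a minimal sketch under stated assumptions, not the procedure as tabulated by Stein or Ruben: it takes a first sample of 10 observations, uses 2.262 as the two-sided 5 per cent point of $t$ on 9 degrees of freedom, sets the maximum length at 0.5 with unit population variance, and all function and variable names are hypothetical.

```python
import math
import random
import statistics

def two_stage_interval(mu, sigma, n_first, t_value, max_length, rng):
    """One run of the double-sampling rule; returns (lower, upper, n)."""
    k = 2 * t_value / max_length
    sample = [rng.gauss(mu, sigma) for _ in range(n_first)]
    s2 = statistics.variance(sample)          # estimate on n_first - 1 d.f.
    n = max(1, math.ceil(k * k * s2))         # smallest n with k^2 s^2 <= n
    if n > n_first:
        sample += [rng.gauss(mu, sigma) for _ in range(n - n_first)]
    else:
        sample = sample[:n]                   # first sample already large enough
    mean = statistics.fmean(sample)
    half = t_value * math.sqrt(s2 / n)        # s comes from the first sample only
    return mean - half, mean + half, n

rng = random.Random(7)                        # fixed seed so the run is repeatable
runs = [two_stage_interval(0.0, 1.0, 10, 2.262, 0.5, rng) for _ in range(3000)]
coverage = sum(lo < 0.0 < hi for lo, hi, _ in runs) / len(runs)
longest = max(hi - lo for lo, hi, _ in runs)
mean_n = statistics.fmean(n for _, _, n in runs)
print(round(coverage, 3), round(longest, 4), round(mean_n, 1))
```

The interval length never exceeds the prescribed maximum by construction, and the observed coverage should lie close to 95 per cent, since with these settings the first sample is almost never larger than the required total.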
Before proceeding further we should notice that the inequality for $n$ above tells us that

$$\sqrt{\{1-(1/n)\}}\,\sqrt{(n)}/s < k \le \sqrt{(n)}/s,$$

so that the proportional error in replacing $\sqrt{(n)}/s$ by $k$ cannot exceed $1-\sqrt{\{1-(1/n)\}}$. Further,
the same inequality tells us that the mean value of $n$, $\mathcal{E}(n)$, satisfies

$$\mathcal{E}(n-1) < k^2\sigma^2 \le \mathcal{E}(n),$$

and the distribution of $n$ will be governed by that of $s^2$, and in fact it will be a ‘grouped’
form of the distribution of $k^2s^2$. Without making a further detailed analysis (which has in
fact been made by Mr Ruben), it is evident that the proportional error involved in replacing
$\sqrt{(n)}/s$ by $k$ will be of the order of $1/(k^2\sigma^2)$ which is of the order of $l^2/\sigma^2$. Thus the error will be
small, provided that the precision with which the mean $\mu$ is required to be estimated is high
compared with the standard error of a single observation. The error itself arises because of
the fact that the sample size $n$ has necessarily to be an integer, while $k^2s^2$ need not be an integer.
Thus the situation is closely comparable with that occurring in the theory of sequential
tests, where the true risks of error, $\alpha'$ and $\beta'$, are not exactly equal to the prescribed risks,
$\alpha$, $\beta$, for the reason that a whole number of observations must be taken; here, too, the error
involved in equating $\alpha'$ and $\beta'$ with $\alpha$ and $\beta$ is small when the test being made is a sensitive
one, i.e. when $\alpha'$ and $\beta'$ are themselves small, and the mean sample size is consequently
large. In view of this discussion we shall henceforward assume that it is legitimate to replace
$K = \sqrt{(n)}/s$ by $k$, and consequently that with the double-sample procedure we have given,
we can take

$$k(\bar{x}-\mu)$$

to be distributed as $t$ on $n'-1$ degrees of freedom.

If we turn now to the case where two samples, one from each of two populations, are
involved, if the two samples of total size $n_1$, $n_2$, respectively, are obtained by double-sampling
procedures like that described, with values $n_1'$, $k_1$ fixed in advance for sampling from the first
population, and $n_2'$, $k_2$ fixed in advance for sampling from the second population, then we
shall have the quantities

$$k_i(\bar{x}_i - \mu_i) \quad (i = 1, 2)$$

distributed as $t$ on $n_i'-1$ degrees of freedom, approximately, and the two quantities will
obviously be independent. Thus their joint distribution will be (approximately) that used as
the basis for Sukhatme’s tables, and this fact can be used to derive a method for determining
confidence limits, of prescribed maximum length, for the difference $\mu_1 - \mu_2$. This procedure
is due to Mr Ruben, and the details have been worked out by him.

So far, the argument has proceeded along Neyman-Pearson lines. We now revert to the 
Fisher-Behrens problem, and cease to argue along these lines. The Fisher-Behrens problem 
arises when we are given two samples, sizes $n_i$, means $\bar{x}_i$, with variance estimates $s_i^2$. We do
not in general know how these samples were obtained, beyond what is required to justify 
us in regarding them as consisting of independent observations, randomly chosen from the 
two populations in question. There seems, therefore, to be nothing in the data which would 
prevent us from assuming that the two samples in question arose as a result of the carrying 
out of double-sample procedures such as have been described, in which the constants chosen 








in advance happened to be $n_i' = n_i$, and $k_i = \sqrt{(n_i)}/s_i$. If the samples had in fact been obtained
by such a procedure, then in repetitions of the procedure the quantities

$$k_i(\bar{x}_i - \mu)$$

would be independently distributed as $t$, on $n_i - 1$ degrees of freedom, assuming the null
hypothesis $\mu_1 = \mu_2 = \mu$. The standardized difference between the sample means, $d$, would
then in fact be distributed as calculated in Sukhatme’s tables, and the frequency of ‘errors 
of the first kind’ would be correctly given by Fisher’s test, up to the approximation already 
discussed. 

It thus appears that provided the reference set is taken as a double-sampling reference set, 
in which $n_i'$ and $k_i$ are held fixed, the Fisher-Behrens test has a ‘frequency justification’.
Those who argue that the test should not be used, on the grounds that it gives a wrong 
estimate of the frequency of ‘errors of the first kind’, are under the necessity of justifying the 
choice of the reference set in which n, are held fixed. 

One method of justifying this latter choice which the present writer has met with consists 
in saying that in fact, in the Fisher-Behrens problem, the sample sizes $n_i$ are fixed in advance
by the choice of the experimenter. This line of argument is not often taken by statisticians
who have much to do with giving practical advice to experimentalists, and the reason for
this is not far to seek. The present writer’s own experience is that the sample sizes that one
meets with in giving practical advice can rarely be said to be fixed in advance by the choice
of the experimenter. Of course, one tries to get people to estimate in advance how many
observations are necessary to determine the issues they have in mind, and if possible to take
at least as many observations as appear necessary for this. But the final figure actually 
obtained for the sample size is more often determined by such factors as the amount of time 
and money available for the experiments in question, the weather, the domestic circumstances 
of the experimenter, and so forth. To give a concrete example, the present writer recently 
met Prof. Haldane and his wife in Paris, attending a Conference. While there, the Haldanes 
took the opportunity of examining all the cats they could find, for sex and coloration. No 
doubt, on their return, they will publish the data they have collected, and draw some con- 
clusions about the genetics of Parisian cats. They will not consider it relevant to their con- 
clusions to state that the total number of cats examined was not determined by them in 
advance, but consisted of all the cats they could manage to examine; nor is anyone else 
likely to think it relevant to inquire how their sample size came to be what it was. All that 
is necessary for them to state about their sampling procedure is, that the cats examined by 
them constitute a random sample from the population of cats they might have examined in 
this way, and they will, of course, recognize that any conclusions they draw will relate to this 
population of cats—and not, for example, to the population of all cats in France. Their sample 
of cats will not have been obtained by a double-sampling procedure; but no more will it 
have been obtained by a fixed-sample size procedure. And consequently one will have as 
much justification for referring their sample to a double-sampling reference set, as one will 
have in referring their sample to a fixed sample size reference set. As much justification, and 
no more; in actual fact, in the present writer’s view, neither reference set should really be 
involved in the statistical examination of their results. 

Another, apparently more weighty, argument that can be advanced in favour of holding 
the sample size constant is, that in actual fact in every experiment we do, the sample size 
will come to some value, $N_k$ say, for the kth experiment, $N_k$ perhaps being a number fixed
in advance, or perhaps being determined by chance causes. If, now, we take as significant,
for each sample size, only those results which fall in a group having probability (say) 0.05
on the hypothesis being tested, then this means that our overall frequency of errors of the
first kind will (by the law of large numbers) converge to 0.05, however $N_k$ varies as k varies.
A posteriori, as it were, we can stratify our population of experiments according to the sample
sizes involved, and taking 5 % of each stratum* means taking 5 % of the whole.
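
The stratification argument just stated can be tried out numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a two-sided test on the mean of normal observations with known unit variance, and the function name `overall_error_rate` and the range of sample sizes are hypothetical choices.

```python
import random
from statistics import NormalDist


def overall_error_rate(sample_sizes, alpha=0.05, seed=0):
    """Share of experiments that reject a true null hypothesis when every
    experiment uses a size-alpha two-sided z-test at its own sample size."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for n in sample_sizes:
        # n unit-variance normal observations with true mean 0 (null is true)
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        if abs(mean) * n ** 0.5 > z_crit:
            rejections += 1
    return rejections / len(sample_sizes)


# Sample sizes "determined by chance causes": hypothetical, uniform on 2..30.
size_rng = random.Random(1)
sizes = [size_rng.randint(2, 30) for _ in range(20000)]
rate = overall_error_rate(sizes)
print(f"overall frequency of errors of the first kind: {rate:.3f}")
```

Note that each simulated experiment is judged within the stratum of its own fixed sample size, so the sketch illustrates the argument only under the very choice of reference set that the following paragraphs call into question.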

In the opinion of the present writer, however, this argument contains a vicious circle, in 
that it assumes what is under dispute. To see the fallacy it is necessary to state precisely the 
law of large numbers as it is being applied to the case in point. What we are saying is, in 
effect, that if we have a sequence of independent events $E_k$ ($E_k$ will here mean ‘making an
error of the first kind’), which on a hypothesis $\mathcal{H}_k$ have each a probability 0.05,

$$\Pr\{E_k \mid \mathcal{H}_k\} = 0.05 \quad \text{for every } k,$$

where k ranges over 1, 2, 3, ..., N; then if N is large a result of the form

$$E_1 . E_2 . \tilde{E}_3 . E_4 \ldots E_N,$$

where ‘.’ means ‘and then’, and ‘~’ means ‘not’, and where the bar occurs over just 5 % of
the letters, is enormously more probable than a result in which the bar occurs over more or
over less than 5 % of the letters. The point to be emphasized is that we have here an enormous
probability, which must not be confused with a certainty. The essential difference between
a probability and a certainty is, that the first involves a reference set, while the second does
not. In fact, the reference set involved in the probability we are speaking of is that obtained
by the conjunction of the $\mathcal{H}_k$:

$$\mathcal{H}_1 . \mathcal{H}_2 . \mathcal{H}_3 . \mathcal{H}_4 . \mathcal{H}_5 \ldots \mathcal{H}_N.$$

Now in our case $\mathcal{H}_k$ is not to be interpreted as meaning just the hypothesis about the
distribution of a single observation made in the kth experiment. $\mathcal{H}_k$ must also specify the
reference set involved in the calculation of the probability of $E_k$. What is assumed in the
argument we are now discussing is, that the combined reference set to which we must refer
the overall probability of a sequence of errors of the first kind, is one in which the particular
combination of sample sizes actually observed,

$$N_1 . N_2 . N_3 . N_4 . N_5 \ldots N_N,$$

is held fixed. It thus appears that the argument assumes in effect what has to be proved. If
we were to admit that the overall reference set could sometimes involve cases where things 
other than the sample size were held fixed, then we could make the same kind of statement 
about the long run frequency of errors, for these reference sets, as we can for the reference 
sets of fixed sample size. In actual fact, of course, since the introduction and use of sequential 
tests, some of the reference sets used by the Neyman-Pearson school are those in which the 
likelihood ratio, rather than the sample size, is held fixed. And as other sequential procedures 
come into more general use, such as, in particular, Mr Ruben’s procedure for estimating the 
difference of means, statisticians who think in terms of reference sets will become accustomed 
to using them, and the reference set here proposed for the Fisher-Behrens problem will not 
appear so ‘special’. 

* I am leaving aside the objection that this cannot always be done, due to discreteness effects, e.g. in
the binomial problem for small sample sizes, or the 2 × 2 table. Such an objection can be met by an
advocate of the Neyman-Pearson theory by introducing the idea of a ‘random decision’, where the question
of the significance or otherwise of a given result can be made to depend on the throw of a die. In my
view, such random decisions are absurd in the present context (and I have yet to meet an experimenter
who would be prepared to submit to one), but I do not wish to argue this point here.


As has already been stated, it is the view of the present writer that the arbitrary nature of 
the reference set involved, on the Neyman-Pearson theory, in a test of significance, is a 
decisive reason for rejecting that theory, as a theory of inference, in favour of using a theory 
of inference, such as that given by Fisher, where the idea of a reference set does not enter. 
It should be emphasized, however, that the Neyman-Pearson theory is an exceedingly 
valuable weapon in the advance planning of experimentation. To put the matter shortly, 
two kinds of quantity are involved in uncertain inference. The present writer has called them 
f-odds and b-odds, but Fisher has more aptly (though slightly less precisely) called them 
probability and likelihood. Probabilities are relevant before an experiment has been per- 
formed, when we are planning it. After the experiment has been performed, when we are 
drawing conclusions, likelihoods are relevant. As a theory based on probabilities, the 
Neyman-Pearson theory is useful in planning, before the result is known; but after the result 
is known, the theory of likelihood should be used. 

Since this paper was completed the author has had occasion to re-read Yates’s paper
(1939), and the following quotation (from p. 583) seems of interest: ‘The consideration of
what will occur in regions of fixed s (Yates’s $s$ = our $s/\sqrt{n}$ = $k$, approximately) is also of
interest when a whole group of experiments is under review. It is inevitable in practice that 
the experiments will tend to be classified according to their apparent accuracy; e.g. we may 
reject entirely experiments which have a very low apparent accuracy. If such is the case 
we shall in fact be sampling in regions of fixed s rather than fixed o.’ In this light, the present 
paper may be regarded as giving a precise interpretation to the phrase ‘sampling in regions 
of fixed s (= our k, approximately)’. 


REFERENCE 


Yates, F. (1939). An apparent inconsistency arising from tests of significance based on fiducial dis- 
tributions of unknown parameters. Proc. Camb. Phil. Soc. 35, 579. 













THE INCOMPLETE BETA FUNCTION AS A CONTOUR INTEGRAL 
AND A QUICKLY CONVERGING SERIES FOR ITS INVERSE 


By M. E. WISE 


1. INTRODUCTION


Although so much work has been done on the incomplete beta function its mathematical 
treatment can still be simplified. In particular, there is a simple but quickly converging 
expansion for its percentage points which, surprisingly, seems to have been missed. It is
for solving

$$I_x(p,q) = \frac{\int_0^x t^{p-1}(1-t)^{q-1}\,dt}{\int_0^1 t^{p-1}(1-t)^{q-1}\,dt} = P, \tag{1-1}$$

for either p or x, which are found in terms of percentage points of the $\chi^2$ distribution. The
result was needed in a sampling problem for calculating p or x in skew distributions in which
p > q, but it is found to be accurate even when p = q, and x is thereby obtained more easily
than from the variance ratio ‘F’ or its logarithm z.

The expansion is theoretically interesting in that it explains some other published approximations
that depend on $\chi^2$. It is derived from a contour integral for $I_x(p,q)$ which is expanded;
with this approach it is easier to reverse the series, as it is not necessary to treat the numerator 
and denominator of (1-1) separately. We start instead from the incomplete negative binomial 
series and obtain also a simple proof of its well-known relation to the incomplete beta function 
and to the incomplete positive binomial series. This will be given first. 


2. INCOMPLETE BINOMIAL SERIES AND CONTOUR INTEGRALS 
FOR THE INCOMPLETE BETA FUNCTION 


(a) The sum of the first p terms of the positive binomial series 


In a random sample of n from a population with proportion x ‘good’ and 1 − x ‘bad’, let Q be
the probability that 0, 1, 2, ... or (p − 1) are good, then

$$Q = (1-x)^n + n(1-x)^{n-1}x + \frac{n(n-1)}{2!}(1-x)^{n-2}x^2 + \ldots + \frac{n(n-1)\ldots(n-p+2)}{(p-1)!}\,x^{p-1}(1-x)^{n-p+1}$$

$$= \frac{1}{2\pi i}\int_O (1-x+xz)^n\left(\frac{1}{z}+\frac{1}{z^2}+\ldots+\frac{1}{z^p}\right)dz, \tag{2-1}$$

where O is a contour of integration passing counter-clockwise round the origin

$$= \frac{1}{2\pi i}\int_O (1-x+xz)^n z^{-p}\,\frac{dz}{1-z}, \tag{2-2}$$

always provided that z = 1 is outside of O. To verify that this is the incomplete beta ratio,
we find

$$\frac{\partial Q}{\partial x} = -\frac{1}{2\pi i}\int_O n(1-x+xz)^{n-1}z^{-p}\,dz.$$

Putting $z' = xz/(1-x)$,

$$\frac{\partial Q}{\partial x} = -\frac{n\,x^{p-1}(1-x)^{n-p}}{2\pi i}\int_O (1+z')^{n-1}z'^{-p}\,dz'. \tag{2-3}$$

Q = 0 when x = 1 and 1 when x = 0, so

$$Q = \frac{\int_x^1 t^{p-1}(1-t)^{n-p}\,dt}{\int_0^1 t^{p-1}(1-t)^{n-p}\,dt},$$

and so

$$1 - Q = I_x(p, n-p+1). \tag{2-4}$$


Karl Pearson (1924) first found this result by repeated integrations by parts of the beta 
integral. 
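
The relation between the sum of the first p binomial terms and the incomplete beta ratio, equation (2-4), can be checked numerically. The following is a minimal sketch and not part of the paper: the function names are hypothetical, and the beta ratio is evaluated by Simpson's rule under the assumption that both indices are at least 1, so that the integrand stays bounded.

```python
from math import comb


def binomial_head(n, p, x):
    """Q: probability of 0, 1, ..., p-1 'good' in a sample of n when the
    population proportion of 'good' is x."""
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(p))


def incomplete_beta_ratio(x, a, b, steps=20000):
    """I_x(a, b) by composite Simpson's rule (assumes a >= 1 and b >= 1)."""

    def f(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    def integral(upper):
        h = upper / steps  # steps is even, as Simpson's rule requires
        total = f(0.0) + f(upper)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * f(i * h)
        return total * h / 3

    return integral(x) / integral(1.0)


n, p, x = 12, 5, 0.3
lhs = 1 - binomial_head(n, p, x)
rhs = incomplete_beta_ratio(x, p, n - p + 1)
print(lhs, rhs)  # the two values should agree to many decimal places
```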


(b) The sum of the first p terms of the negative binomial series 


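Later, Pearson gave the incomplete negative binomial series, also, as an incomplete beta
function (1933), proved by a method found by E. C. Fieller from a real integral form of the
remainder in Taylor’s expansion (see also Kendall, 1945, p. 120). We shall now find the
corresponding contour integral. If −q is the index of the series and Q′ the sum of its first
p terms,

$$Q' = (1-x)^q\left\{1 + qx + \frac{q(q+1)}{2!}x^2 + \ldots + \frac{q(q+1)\ldots(q+p-2)}{(p-1)!}\,x^{p-1}\right\} \tag{2-5}$$

$$= \frac{1}{2\pi i}\int_O\left(\frac{1-xu}{1-x}\right)^{-q}\left(\frac{1}{u}+\frac{1}{u^2}+\ldots+\frac{1}{u^p}\right)du \tag{2-6}$$

$$= \frac{1}{2\pi i}\int_O\left(\frac{1-xu}{1-x}\right)^{-q}u^{-p}\,\frac{du}{1-u}. \tag{2-6'}$$

Differentiating Q′ with respect to 1/(1 − x) and then putting u′ = xu, it is easily found that

$$\frac{\partial Q'}{\partial x} = -\frac{q\,x^{p-1}(1-x)^{q-1}}{2\pi i}\int_O u'^{-p}(1-u')^{-q-1}\,du'. \tag{2-7}$$

Then since Q′ = 0 when x = 1 and Q′ = 1 when x = 0,

$$Q' = 1 - I_x(p,q). \tag{2-8}$$

Alternatively, (2-6) can be written

$$Q' = \frac{1}{2\pi i}\int_O\left(\frac{1-x}{1-xu}\right)^{p+q}\left(\frac{(1-x)u}{1-xu}\right)^{-p}\frac{du}{1-u}. \tag{2-6''}$$

Now putting $\dfrac{(1-x)u}{1-xu} = z$, (2-6″) reduces to (2-2), provided that

$$n = p+q-1. \tag{2-9}$$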
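
The statement that the negative binomial sum coincides with the positive binomial sum when n = p + q − 1 is easy to confirm for whole-number p and q. The sketch below is illustrative only; the function names are hypothetical and q is restricted to integers so that the coefficients can be computed with `math.comb`.

```python
from math import comb


def negative_binomial_head(q, p, x):
    """Q': (1 - x)^q times the first p terms of the expansion of (1 - x)^(-q)."""
    return (1 - x) ** q * sum(comb(q + k - 1, k) * x**k for k in range(p))


def binomial_head(n, p, x):
    """Q: first p terms (0 .. p-1 'good') of the binomial with index n."""
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(p))


p, q, x = 5, 3, 0.3
q_prime = negative_binomial_head(q, p, x)
q_binom = binomial_head(p + q - 1, p, x)
print(q_prime, q_binom)  # equal up to rounding when n = p + q - 1
```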


(c) Generalization to any positive values of p and q 
When p is not an integer (q of course can have any positive value) we define Q′ by changing
the contour of (2-6) into the one usually written as $\int_{-\infty}^{0+}$, which starts from $-\infty - 0i$, goes
round the origin in the positive sense and returns to $-\infty + 0i$; in this case it must cross the real
axis between 0 and 1. The corresponding path in the z′ plane (equation (2-3)) is $\int_{-1}^{0+}$. It
follows in the same way as before that

$$Q' = \frac{1}{2\pi i}\int_{-\infty}^{0+}\left(\frac{1-xu}{1-x}\right)^{-q}u^{-p}\,\frac{du}{1-u}, \tag{2-10}$$

except that we have to prove that Q′(0) = 1. This is easily done by changing the contour to
one passing to the right of u = 1; this adds −1 to the integral. But the new path is equivalent
to a circle with infinite radius, for which the integral Q′(0) − 1 is clearly zero.

There is no need to evaluate the complete beta integral; this would come from (2-3), since

$$\frac{1}{2\pi i}\int_{-1}^{0+}(1+z')^{n-1}z'^{-p}\,dz' = \frac{(n-1)!}{(p-1)!\,(n-p)!}, \tag{2-11}$$

when p is a whole number; the generalization to all values of p is found in text-books
(e.g. Whittaker & Watson, 1940, Chapter 12).





3. OUTLINE OF THE METHOD OF OBTAINING THE EXPANSIONS 


The main idea was first applied by Molina (1932) to the beta integral

$$\int_0^x t^{p-1}(1-t)^{q-1}\,dt. \tag{3-1}$$


This is to express the integrand as a product of an exponential with a large index and 
a power series, and to integrate term by term, so that the resulting series is in negative 
powers of the index. Molina obtained a general expression for all possible series obtainable 
in this way, deduced as a special case (1932) a simple and quickly converging expansion of
(3-1) in powers of $(p+\tfrac12 q-\tfrac12)^{-2}$, and used it to sum a few incomplete binomial series. The
writer found the same series without finding the general expansion first (1946, 1948), in
ignorance of Molina’s work, and deduced the corresponding expansion for the ratio (1-1)
(1946, equation (C12)). In order to find this expansion directly without the laborious con- 
version of integral to ratio, we start from (2-10). The integrand is factorized in the same way, 
and the convergence is rapid because the factors can be chosen so that the power series is a 
hyperbolic sine (Wise, 1946, equation (C8)). To do this, we use the identity 


$$1-ux = (ux)^{1/2}\,2\sinh\{-\tfrac12\log_e(ux)\}, \tag{3-2}$$

and corresponding identities for 1 − x and 1 − u, and put

$$ux = e^{-v}.$$

Then

$$1 - I_x(p,q) = \frac{1}{2\pi i}\int_{c_v}\left(\frac{\sinh\tfrac12 v}{-\sinh(\tfrac12\log_e x)}\right)^{-q}\frac{e^{(p+\frac12 q-\frac12)v}\,x^{p+\frac12 q-\frac12}\,dv}{2\sinh\tfrac12(v+\log_e x)},$$

where $c_v$ goes from $-\infty - i\pi$, across the real axis to the right of $-\log_e x$, and back to $-\infty + i\pi$.
This shows that the series is in powers of $p + \tfrac12 q - \tfrac12$. Write

$$N = p+\tfrac12 q-\tfrac12, \qquad y = -N\log_e x,$$

and put $w = N(v+\log_e x) = -N\log_e u$, then

$$1 - I_x(p,q) = \frac{1}{2\pi i}\int_{-\infty}^{0+}\left(\frac{\sinh\dfrac{w+y}{2N}}{\sinh\dfrac{y}{2N}}\right)^{-q}\frac{e^{w}\,dw}{2N\sinh\dfrac{w}{2N}}, \tag{3-3}$$

which has an expansion in powers of $N^{-2}$.


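4. EXPANSION FOR THE INCOMPLETE BETA FUNCTION IN POWERS OF $(p+\tfrac12 q-\tfrac12)^{-2}$


The expansion of (3-3) is

$$1 - I_x(p,q) = \frac{1}{2\pi i}\int_{-\infty}^{0+} e^{w}\left(1+\frac{w}{y}\right)^{-q}\left\{\frac{1}{w} - \frac{\omega_1(y)}{24N^2} + \frac{\omega_2(y)}{5760N^4} + \ldots\right\}dw, \tag{4-1}$$

where

$$\omega_1(y) = w(q+1) + 2qy,$$

$$\omega_2(y) = 7w^3 + q(w+2y)\,[4y^2 + yw(10q+4) + w^2(5q+12)].$$

It is convenient to write

$$I_q = \frac{1}{2\pi i}\int_{-\infty}^{0+} e^{w}\left(1+\frac{w}{y}\right)^{-q}dw;$$

then putting w = t − y,

$$I_q = \frac{y^q e^{-y}}{2\pi i}\int_{-\infty}^{0+} t^{-q}e^{t}\,dt = \frac{y^q e^{-y}}{(q-1)!}, \tag{4-2}$$

since the integral in (4-2) is Hankel’s contour integral for the reciprocal of the gamma
function (see Whittaker & Watson, 1940, Chapter 12).

Then (4-1) can be written

$$1 - I_x(p,q) = 1 - P_0(y) - \frac{P_1(y)}{24N^2} + \frac{P_2(y)}{5760N^4} + \ldots. \tag{4-3}$$

Now $\partial P_0/\partial y = -(q/y^2)\,I_{q+1}$, and $P_0 = 0$ when y is infinite and 1 when y is 0. These three facts prove that

$$1 - P_0(y) = \frac{\int_0^y t^{q-1}e^{-t}\,dt}{\int_0^\infty t^{q-1}e^{-t}\,dt}. \tag{4-4}$$

$P_1$ and $P_2$ are easily found by expressing $\omega_1$ and $\omega_2$ as polynomials in 1 + w/y, and the ensuing
integrals in terms of $I_q$:

$$P_1(y) = \frac{y^q e^{-y}}{(q-2)!}\,(q+1+y), \tag{4-5}$$

$$P_2(y) = \frac{y^q e^{-y}}{(q-2)!}\,\{(q-3)(q-2)(5q+7)(q+1+y) - (5q-7)\,y^2(q+3+y)\}. \tag{4-6}$$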
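
A rough numerical check of the three-term series is sketched below. It is not from the paper: it assumes the series reads $I_x(p,q) \approx P_0(y) + P_1(y)/24N^2 - P_2(y)/5760N^4$ with $N = p + \tfrac12 q - \tfrac12$, $y = -N\log_e x$, $P_0(y)$ the upper tail of the gamma distribution with index q, and $P_1$, $P_2$ as in (4-5) and (4-6). The function names are hypothetical, and p and q are taken as whole numbers (q at least 2) so that the exact value is a finite binomial sum.

```python
from math import comb, exp, factorial, log


def exact_I(x, p, q):
    """Exact I_x(p, q) for whole-number p, q as a binomial tail, n = p + q - 1."""
    n = p + q - 1
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(p, n + 1))


def series_I(x, p, q):
    """Three-term series in powers of N^-2 (assumed form, see lead-in); q >= 2."""
    N = p + 0.5 * q - 0.5
    y = -N * log(x)
    # P0: probability that a gamma variate with index q exceeds y
    P0 = exp(-y) * sum(y**k / factorial(k) for k in range(q))
    g = y**q * exp(-y) / factorial(q - 2)
    P1 = g * (q + 1 + y)
    P2 = g * ((q - 3) * (q - 2) * (5 * q + 7) * (q + 1 + y)
              - (5 * q - 7) * y**2 * (q + 3 + y))
    return P0 + P1 / (24 * N**2) - P2 / (5760 * N**4)


exact = exact_I(0.5, 4, 4)
approx = series_I(0.5, 4, 4)
print(exact, approx)
```

Taking p = q = 4 and x = 0.5 gives a case in which the exact value is 0.5 by symmetry, so the size of the remaining error can be read off directly.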


(4-3) is equivalent to the series obtained in the previous paper (Wise, 1946, equation (C 12)).
A similar result was obtained independently by Rao (1948).*

* I would like to thank Dr J. Wishart for drawing my attention to this work.


5. EXPANSIONS FOR THE INVERSE FUNCTION

If y is unknown and P, which will be written for $I_x(p,q)$, is known, clearly the first approximation
to y, $y_0$ say, is given by (4-4), i.e. (letting 1/N = 0)

$$P = P_0(y_0), \tag{5-1}$$

or

$$y_0 = \tfrac12\chi^2_{2q}(P), \tag{5-1'}$$

i.e. half the value of $\chi^2$ exceeded by a proportion P of its distribution with 2q degrees of
freedom. That is to say, if N is unknown and x is known, the first approximation $N_0$ to N is

$$N_0 = -\frac{\chi^2_{2q}(P)}{2\log_e x}, \tag{5-2}$$

and the first approximation to x for known N must be

$$x_0 = \exp\left[-\frac{\chi^2_{2q}(P)}{2N}\right]. \tag{5-3}$$

Thomson (1947) has found (5-3) by an interesting empirical argument, independently.
His results will be discussed later.

The expansion for y as a function of P is clearly, if x is the unknown, of the form

$$\frac{y}{y_0} = \frac{\log_e x}{\log_e x_0} = 1 + \frac{\delta_2}{24N^2} + \frac{\delta_4}{5760N^4} + \ldots. \tag{5-4}$$

Similarly, if N (i.e. p) is the unknown, we can put

$$\frac{N}{N_0} = 1 + \frac{\delta_2'}{24N_0^2} + \frac{\delta_4'}{5760N_0^4} + \ldots. \tag{5-5}$$

Then, since (5-4) and (5-5) are identical,

$$\delta_2' = \delta_2, \qquad \delta_4' = \delta_4 - 20\delta_2^2, \tag{5-6}$$

and we only have to calculate $\delta_2$ and $\delta_4$. To do this, the right-hand side of (4-1) is expanded as
a Taylor series about $y = y_0$. The resulting contour integrals are all expressed in terms of
$I_{q+3}$, $I_{q+2}$, etc., by putting $w = y_0(Y - 1)$. Then by using (4-2) they are expressed in terms of
$I_q$. Finally, equating coefficients of $N^{-2}$ and $N^{-4}$ gives

$$\delta_2 = (q-1)(q+1+y_0), \tag{5-7}$$

$$\frac{\delta_4}{(q-1)(q+1+y_0)} = \frac{2y_0^2(4q-y_0-8)}{q+1+y_0} + (13q-21)(q+2) + 10(q-1)(q+y_0+\tfrac12), \tag{5-8}$$

and from (5-6)

$$\frac{\delta_4'}{(q-1)(q+1+y_0)} = \frac{2y_0^2(4q-y_0-8)}{q+1+y_0} + (13q-21)(q+2) - 10(q-1)(q+y_0+\tfrac32). \tag{5-9}$$

Thus the second approximation to x is given by

$$\frac{y}{y_0} = \frac{\log_e x}{\log_e x_0} = 1 + \frac{(q-1)(q+1+y_0)}{24N^2}, \tag{5-10}$$

and that for N by

$$\frac{N}{N_0} = 1 + \frac{(q-1)(q+1+y_0)}{24N_0^2}. \tag{5-10'}$$

Adding the $N^{-4}$ term multiplies the approximation to x by

$$\exp\left[-\frac{y_0\,\delta_4}{5760N^5}\right], \tag{5-11}$$

and adds to that for N the term

$$\frac{\delta_4'}{5760N_0^3}. \tag{5-12}$$
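
The second approximations (5-10) and (5-10′) are simple enough to try out directly. The following is a minimal sketch under stated assumptions, not the author's own computing procedure: q is taken as a whole number so that the gamma tail is a finite sum, $y_0$ is found by bisection instead of from tables of $\chi^2$, and all function names are hypothetical.

```python
from math import exp, factorial, log


def gamma_upper_tail(y, q):
    """P0(y): probability that a gamma variate with whole-number index q exceeds y."""
    return exp(-y) * sum(y**k / factorial(k) for k in range(q))


def first_approx_y0(P, q):
    """Solve P0(y0) = P by bisection, i.e. y0 is half the chi-square value with
    2q degrees of freedom that is exceeded with probability P (0 < P < 1)."""
    lo, hi = 0.0, 1.0
    while gamma_upper_tail(hi, q) > P:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if gamma_upper_tail(mid, q) > P:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def approx_x(p, q, P):
    """Second approximation to x: y = y0 * (1 + (q-1)(q+1+y0) / 24 N^2), x = exp(-y/N)."""
    N = p + 0.5 * q - 0.5
    y0 = first_approx_y0(P, q)
    y = y0 * (1 + (q - 1) * (q + 1 + y0) / (24 * N**2))
    return exp(-y / N)


def approx_N(x, q, P):
    """Second approximation to N: N = N0 * (1 + (q-1)(q+1+y0) / 24 N0^2), N0 = -y0/log x."""
    y0 = first_approx_y0(P, q)
    N0 = -y0 / log(x)
    return N0 * (1 + (q - 1) * (q + 1 + y0) / (24 * N0**2))


x_hat = approx_x(4, 4, 0.5)    # exact x is 0.5 by symmetry when p = q
N_hat = approx_N(0.5, 4, 0.5)  # exact N is 4 + 2 - 0.5 = 5.5
print(x_hat, N_hat)
```

The case p = q = 4, P = 0.5 is chosen because the exact answers are known without tables; the resulting percentage errors can be compared with the corresponding entries tabulated in the next section.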








6. NUMERICAL ACCURACY OF THE APPROXIMATIONS GIVEN BY (5-10), (5-10′)

The error in x (or N) as calculated by neglecting $\delta_4$ (or $\delta_4'$) is of interest. Since

$$I_{1-x}(q,p) = 1 - I_x(p,q), \tag{6-1}$$

one can always make p > q. In order to obtain the error as accurately as possible x has therefore
been computed in the worst cases, i.e. p = q, for comparison with some of the exact values
published in Catherine Thompson’s (1941) tables. The values of $y_0$ were obtained from
Thompson’s (1941) tables of $\chi^2$ percentage points; she mentions that linear interpolation is
usually accurate enough for fractional values of q.


Table 1. Percentage errors in x for p = q

  p = q   N       P                  0.995     0.9       0.5       0.1       0.005     (N/10)^5
                  $\xi_P$           -2.5758   -1.2816    0         1.2816    2.5758

  4       5.5     $y_0$              0.67221   1.74477   3.67206   6.6808   10.9775    0.05033
                  Exact x            0.88230   0.72140   0.5       0.27860   0.11770
                  Percentage error   0.015     0.05      0.15      0.44      1.12

  7.5     10.75   $y_0$              1.53691   3.1519    5.6701    9.2747   14.1500    1.436
                  Exact x            0.80275   0.66355   0.5       0.33645   0.19725
                  Percentage error   0.03      0.065     0.17      0.39      0.85

  15      22      $y_0$              6.8933   10.2996   14.6680   20.1280   26.836     51.54
                  Exact x            0.72435   0.61634   0.5       0.38366   0.27565
                  Percentage error   0.05      0.095     0.18      0.33      0.59

  30      44.5    $y_0$             17.7673   23.2294   29.6673   37.1985   45.9758    1745
                  Exact x            0.66241   0.58250   0.5       0.41750   0.33759
                  Percentage error   0.075     0.12      0.18      0.28      0.42

  60      89.5    $y_0$             41.937    50.317    59.666    70.110    81.813     57570
                  Exact x            0.61620   0.55842   0.5       0.44158   0.38380
                  Percentage error   0.085     0.12      0.17      0.25      0.35

Notes. The results are calculated from

$$y = y_0\left\{1 + \frac{(q-1)(q+1+y_0)}{24N^2}\right\},$$

where $y = -N\log_e x$, $y_0 = \tfrac12\chi^2_{2q}(P)$, $N = p+\tfrac12 q-\tfrac12$.

For a given q and P and varying p, the percentage error is proportional to $N^{-5}$. The calculated values
of x are always larger than the exact values.


$y_0$ can also be found from Campbell’s (1923) series, of which he has given eleven terms:

$$y_0 = q + \xi_P q^{1/2} + \tfrac13(\xi_P^2-1) + \tfrac{1}{36}(\xi_P^3-7\xi_P)\,q^{-1/2} + \tfrac{1}{810}(16-7\xi_P^2-3\xi_P^4)\,q^{-1} + \ldots, \tag{6-2}$$

where

$$\frac{1}{\sqrt{2\pi}}\int_{\xi_P}^{\infty}\exp(-\tfrac12 t^2)\,dt = P. \tag{6-3}$$
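
As a rough check on the use of a series for $y_0$ in place of tables, the first five terms can be evaluated directly. The sketch below is illustrative and rests on assumptions that should be verified against Campbell's paper: it takes the series to be $y_0 = q + \xi q^{1/2} + \tfrac13(\xi^2-1) + \tfrac1{36}(\xi^3-7\xi)q^{-1/2} + \tfrac1{810}(16-7\xi^2-3\xi^4)q^{-1}$, with $\xi$ the normal deviate exceeded with probability P. The function name is hypothetical.

```python
from statistics import NormalDist


def campbell_y0(P, q):
    """First five terms of the series for y0 (assumed form, see lead-in)."""
    xi = NormalDist().inv_cdf(1 - P)  # normal deviate with upper-tail probability P
    r = q ** 0.5
    return (q + xi * r + (xi**2 - 1) / 3
            + (xi**3 - 7 * xi) / (36 * r)
            + (16 - 7 * xi**2 - 3 * xi**4) / (810 * q))


y0_series = campbell_y0(0.5, 4)
print(y0_series)  # compare with the tabulated 3.67206 for q = 4, P = 0.5
```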


A recent paper by Riordan (1949) on this type of reversed series is also of interest.

Obviously for any fixed q and P the percentage error in x, from (5-11), is nearly proportional
to $(p+\tfrac12 q-\tfrac12)^{-5}$, i.e. to $N^{-5}$; that for N from (5-12) to $N_0^{-4}$. Therefore percentage errors have
been tabulated for p = q, and also values of $N^5$, and of $N_0^4$ for P = 0.5. Thus the errors for all
other values of p (≠ q) can be quickly estimated. The values of P selected for tabulation
provide nearly equal intervals of $\xi_P$.

Inclusion of the fourth degree terms appears to reduce the error in x or N by at least
90 %, when p = q; obviously it will reduce it still more when p > q, since the sixth degree
terms in (5-10) and (5-10′) must change x or N by amounts proportional to $xN^{-7}$ or $NN_0^{-6}$
respectively, for fixed q and P. Hence $\delta_4$, $\delta_4'$ can be assumed to give this error. It can be
expressed in powers of $q^{1/2}$ using (6-2). This gives

$$\frac{\delta_4}{q-1} = 3\{24q^3 + 21\xi_P q^{5/2} + (11\xi_P^2-8)\,q^2 + \ldots\}, \tag{6-4}$$

$$\frac{\delta_4'}{q-1} = -8q^3 - 17\xi_P q^{5/2} + \tfrac13(8-41\xi_P^2)\,q^2 + \ldots. \tag{6-4'}$$
Table 2. Percentage errors in N

  p = q   N       P                  0.995     0.9      0.5      0.1      0.005    (N_0/10)^4

  4       5.5     $N_0$              5.37      5.34     5.30     5.23     5.13     0.0787
                  Percentage error   *         0.01     0.04     0.12     0.35

  7.5     10.75   $N_0$             10.47     10.42    10.34    10.24    10.10     1.144
                  Percentage error  -0.005     *        0.035    0.09     0.20

  15      22      $N_0$             21.38     21.28    21.16    21.01    20.83     20.00
                  Percentage error   *         0.015    0.03     0.06     0.11

  30      44.5    $N_0$             43.14     42.98    42.80    42.59    42.34     336.0
                  Percentage error   0.015     0.02     0.03     0.05     0.08

  60      89.5    $N_0$             86.65     86.38    86.10    85.80    85.50     5500
                  Percentage error   0.015     0.02     0.03     0.04     0.06

* Error is between ± 0.005 %.

Notes. The results are calculated from

$$N = N_0\left\{1 + \frac{(q-1)(q+1+y_0)}{24N_0^2}\right\}, \qquad N_0 = -\frac{y_0}{\log_e x}.$$

The exact p is equal to q and for other values of p and a given q and P the percentage error is
proportional to $N_0^{-4}$. The calculated values are the larger except for q = 7.5, P = 0.995.


Thus for a value of P for which $y_0 \approx q$, $\delta_4 \approx 72q^4$, and then when p = q the factor multiplying
into x is 1 − (2/1215). Obviously the percentage error in x increases quickly with $\xi_P$, and
therefore with 1 − P. The absolute error increases also, and if P is small and p is only slightly
greater than q, one obtains a more accurate value of x by interchanging p and q.†

The exact values of x given in Table 1 are regarded as known in testing the accuracy of
the calculation of N = N(x, q, P) (and hence of p). p is required in many binomial sampling
problems, such as those of the authors quoted in §§ 8 (a) and 8 (b). In these problems it is often
useful to have also a mathematical formula for p. Obviously the error in N is extremely
small (e.g. if $y_0 = q = p$, the $N^{-4}$ term adds −N/3270 to N). Thus the formula (5-10′) can be
used even if p < q; if p < q it would be better to interchange them and solve (4-3) for q by
iteration, but in most practical cases p > q.

† Prof. E. S. Pearson pointed this out and has given some illustrations; e.g. for P = 0.05 or less,
p and q should be interchanged when p = 12, q = 10, but not when p = 15, q = 10.


7. COMPARISON OF ERRORS IN X WITH THOSE FROM ‘z’ APPROXIMATIONS 


With p > 2g < 100 (say), the errors in x from (5-10) are generally much smaller than those 
obtained from Cochran’s (1932) formula.* However, Carter (1947) has recently found a more 
accurate formula of the same type, viz 


a BoE p(A+= 


— 
Ca 


d(A t é $8) » 
where : =1+ +e", A= 1 (£2 — 3) &= 
x p ? @\SP ? 


+ 
In deriving this approximation, a term rhe?) (€}+11£,) : neglected. 


For comparison, x has been calculated for p = 30, q = 15; in this case both approximations 
are good, but Carter’s is much closer than (5-10) at one end of the range of values of P, whilst 
at the other end (5-10) is slightly better: 








  P                                     0.995     0.500     0.005
  Percentage error of x from (7-1)     +0.010    +0.006    +0.063
  Percentage error of x from (5-10)    +0.004    +0.013    +0.042




















A study of Carter’s table of values of z calculated from (7-1) and compared with the exact
values (his $n_1 = 2q$, $n_2 = 2p$) shows that in all cases his percentage error increases slowly
with P as well as with d; that is to say, it changes with P and p/q in the opposite way to
our errors, but more slowly. Thus doubling p has divided our error in x from (5-10) for
p = q by 14; trebling p (i.e. p = 3q) would divide it by 70. A rough working rule is that if
p > 3q, the $\chi^2$ formula is better than a ‘z’ approximation such as (7-1); when 3q > p > 2q the
‘z’ formula is better for small P only, and for all P when p < 2q. Clearly if the table of
percentage points of $\chi^2$ could be extended to include more degrees of freedom, all incomplete
beta percentage points could be calculated quickly and accurately; clearly, also,
many of those already tabulated could have been computed much more easily.


8. RELATION TO OTHER $\chi^2$-TYPE APPROXIMATIONS


The expansion method provides the mathematical basis for several recently published
approximations to the inverse function that depend on $\chi^2$.

* See also Thomson’s table (1947, p. 371) under (D).

(a) Scheffé & Tukey (1944) have given a simple approximation to p that is valid when
x is between 0.9 and 1. It seems from a later comment (Birnbaum & Zukerman, 1949) that
it has not been proved mathematically. Their sample size n = p + q − 1 as in (2-1); thus in our
notation $N = n - \tfrac12 q + \tfrac12$, and their formula is

$$N = y_0\left(\frac{1}{1-x} - \frac12\right). \tag{8-1}$$
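
A case in which the accuracy of this formula can be seen without tables is q = 1, where $I_x(p,1) = x^p$, so that the exact solution is $p = \log_e P/\log_e x$, $N = p$ and $y_0 = -\log_e P$. The sketch below is illustrative only; it assumes the formula reads $N = y_0\{1/(1-x) - \tfrac12\}$, and the function names are hypothetical.

```python
from math import log


def exact_N_for_q1(x, P):
    """With q = 1, I_x(p, 1) = x**p = P, and N = p + q/2 - 1/2 = p."""
    return log(P) / log(x)


def scheffe_tukey_N(x, y0):
    """Assumed form of the approximation: N = y0 * (1/(1 - x) - 1/2)."""
    return y0 * (1 / (1 - x) - 0.5)


x, P = 0.95, 0.05
y0 = -log(P)  # upper tail of the gamma distribution with index 1 is exp(-y)
n_exact = exact_N_for_q1(x, P)
n_approx = scheffe_tukey_N(x, y0)
rel_err = abs(n_approx - n_exact) / n_exact
print(n_exact, n_approx, rel_err)
```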


Thus N > 10q, so that the error in N from (5-10′) is negligible; it is of the order of $1.5 \times 10^{-7}N$.
Expand (5-10′) in powers of 1 − x = x′ (say); this gives


ES See ee (g—1)(g+1+4)\_ 2 10(g— 1) (g+1+4Y) 








0 

(8-2) 
and the error in the Scheffé-Tukey approximation is y) times the right-hand side of (8-2). 
This explains why they find that the error is less than 1 % when 1—2x< 0-1. To estimate this 
error more precisely we substitute Campbell’s expansion (6-2) for y. This gives 


gaat H) of) 36 Ea 
7; ct 








8-3) 
Og 39 (8-3) 


Sen +44 =— (fa’ + pgx’?) {Eq-* — 4(10€2 + 2)q-+2'8 Yee gq t+ a he . (8°4) 
Yo x 2 bd L 720 24 108 
This is of interest also in explaining their second approximation to N which follows.


(b) MacCarthy (1947) has quoted a new approximation due to Scheffé & Tukey. In our
notation it is


ree 9+ (Yoga 


er (8-5) 


This looks very different from (8-1) until it is expanded in powers of 1—2z, when one gets 


eG Ee ee 








———+-— =(= — 1) (da’ +- ew’? 4+- hg’? + ....). 8-6 
ie 3 \ue Ks: Tet" +338 ) (8-6) 
Substituting (6-2) for yp, 
ga Pah , , , 
y #2 = — (fa' + gv’? + ph qx"®) (Eq-* — 3 (287 + 1)q7}, (8-7) 
N\ _(N\ _ (2u' +2") (1—-46")_ f 1 86 as | . 
7 yg Nemes "mpi sage 3 hb 


Comparing this with the error (8-4) in their first approximation, the term in $q^{-1/2}$, which is
the largest unless P is very near to ½, has been almost eliminated.

The remaining formulae are approximations to x and have been discussed by Thomson
(1947).

(c) As already mentioned, the first approximation to x, i.e. (5-3), corresponds to D. Halton
Thomson’s approximation (A); he has discussed and tabulated its error for p = 60 and 15 and
various values of q, for P = 0.995, 0.5 and 0.005. This is therefore a table of values of

$$\frac{100(q-1)(q+1+y_0)\,y_0}{3(2p+q-1)^3} \tag{8-9}$$

when x < ½; for x > ½ his tabulated percentages are x/(1 − x) times this amount.










(d) Thomson’s approximation (B) to x is








sg Yo/@ 
aie (2-4 | (8-10) 
This may be written as 
—s~ a eT 7 ’ 
ge at Gall) 


Comparing this with (5-10) the error of (8-11) is nearly equal to 


g (@q- 1) (G+ 1+Yo) 
12N2 24N2 





(8-12) 


and so is positive when P is near 1, negative when P is near zero, and small when P = ½ and
$y_0 \approx q$, as found by Thomson.

(e) Campbell (1923) found the expansion for x in powers of n = p + q − 1. He interpreted
his first approximation to x as follows: Starting from the other end of the binomial series
(2-1), the probability that there are q − 1 or less ‘bad’ in the sample of n is the sum of the first
q terms, and, as is well known, if the average number of ‘bad’ in a sample of n is a and
remains fixed whilst $n \to \infty$, then x = 1 − a/n and





Preltat St. + coil - 1-70, (8°13) 


So as $n \to \infty$, $a \to y_0$, and can be expanded in powers of $n^{-1}$. In our notation, his


A=%q-1-y%) (¢=4), 
and his expansion is 


me hee A 144? + (3¥o+2)A+Yo 
= n(1—2x) = yl 45+ 122 


3642 + (20yy+ 12) A?+ (8y8 + 8yo) A +98 +..| (8-14) 
24n38 op 


This series is obtained from (5-10) by expanding x in powers of $n^{-1}$ ($N = n - \tfrac12 q + \tfrac12$). The term
in $n^{-4}$ has been obtained in Campbell’s series by Riordan.* It is not obvious why one should
instead get a better formula by expanding $\log_e x$ in powers of $p + \tfrac12 q - \tfrac12$; had it been so, the
new series might have been found 27 years ago.








SUMMARY 


Expansions for percentage points of the incomplete beta function satisfying $I_x(p,q) = P$, for
either p or x, have been found in terms of percentage points of the $\chi^2$ distribution. They are
derived from a new contour integral for $I_x(p,q)$ which is shown to be related to contour
integrals for incomplete positive and negative binomial series. The latter integrals provide
simple proofs of the known fact that these series are incomplete beta functions. The resulting
approximations are good over all values of P, q and p or x, unless (for p unknown) p < q,
and they are extremely good for small 1 − P and/or small q/p. They have been compared
(i) numerically with exact tabulated values and with ‘z’ approximations, and (ii) mathematically
with some recently published approximations depending on $\chi^2$, which are thereby
provided with a theoretical basis.

* Unpublished. I would like to thank Mr Riordan also for informing me of the work by Scheffé &
Tukey and MacCarthy.


The author thanks Mr G. Klein (Birkbeck College, University of London) for a helpful 
mathematical discussion. The work was done while he was in the Physics Department of 
Birkbeck College, and he is indebted to his employers in the Philips Companies,* and in 
particular to Dr J. A. M. van Moll, Head of the Material Research Laboratory, Mitcham, 
Surrey, for their support during this period. 


REFERENCES 


Birnbaum, Z. W. & Zukerman, H. S. (1949). A graphical determination of sample size for Wilks’s
tolerance limits. Ann. Math. Statist. 20, 313.

Campbell, G. A. (1923). Probability curves showing Poisson’s exponential summation. Bell Syst.
Tech. J. 2, 97.

Carter, A. H. (1947). Approximation to percentage points of the z distribution. Biometrika, 34, 352.

Cochran, W. G. (1932). Note on an approximate formula for significance levels of z. Ann. Math.
Statist. 11, 93.

Kendall, M. G. (1945). Advanced Theory of Statistics, 2nd ed., vol. 1. London: Griffin.

MacCarthy, Philip J. (1947). Approximate solutions for means and variances in a certain class of
box problems. Ann. Math. Statist. 18, 382.

Molina, Edward C. (1932). Expansion for Laplacian integrals in terms of incomplete gamma
functions. Bell Syst. Tech. J. 11, 563, and Monograph B704, Appendix II.

Pearson, Karl (1924). Note on the relationship of the incomplete beta function to the sum of the
first p terms of the binomial $(a+b)^n$. Biometrika, 16, 202.

Pearson, Karl & Fieller, E. C. (1933). On the applications of the double Bessel function $K_{\tau_1,\tau_2}(x)$ to
statistical problems. Biometrika, 25, 158.

Rao, C. Radhakrishna (1948). Tests of significance in multivariate analysis. Biometrika, 35, 58.

Riordan, John (1949). Inversion formulas in normal variable mapping. Ann. Math. Statist. 20, 417.

Scheffé, H. & Tukey, J. W. (1944). A formula for sample sizes for population tolerance limits. Ann.
Math. Statist. 15, 217.

Thompson, Catherine M. (1941). Tables of percentage points of the incomplete beta function and of
the $\chi^2$ distribution. Biometrika, 32, 151.

Thomson, D. Halton (1947). Approximate formulae for the percentage points of the incomplete
beta function and of the $\chi^2$ distribution. Biometrika, 34, 368.

Whittaker, E. T. & Watson, G. N. (1940). Modern Analysis, 4th ed. Cambridge University Press.

Wise, M. E. (1946). The use of the negative binomial distribution in an industrial sampling problem.
Suppl. J. R. Statist. Soc. 8, 202.

Wise, M. E. (1948). The incomplete beta function and the incomplete gamma function. An acknowledgement.
J. R. Statist. Soc. B, 10, 264.





* He is now employed by the Philips Research Laboratories, Eindhoven (Netherlands).











































ON THE LEVELS OF SIGNIFICANCE OF THE INCOMPLETE
BETA FUNCTION AND THE F-DISTRIBUTIONS


By LEO A. AROIAN, Hughes Aircraft Company 


Approximate formulae for the levels of significance of the incomplete beta function

$$I_x(p,q) = B_x(p,q)/B(p,q) = \int_0^x x^{p-1}(1-x)^{q-1}\,dx \Big/ B(p,q) \quad (0<x<1;\ p,q>0), \tag{1}$$

and also of the F distribution

$$I_{F_0}(n_1,n_2) = \frac{n_1^{\frac12 n_1}\, n_2^{\frac12 n_2}}{B(\tfrac12 n_1, \tfrac12 n_2)} \int_0^{F_0} \frac{F^{\frac12 n_1 - 1}}{(n_1 F + n_2)^{\frac12 (n_1+n_2)}}\,dF \quad (0<F_0<\infty;\ n_1,n_2>0) \tag{2}$$

are given. The result for the incomplete beta function is based on the Cornish-Fisher formula,
but for the F distribution a previous formula (Aroian, 1947) is slightly modified. It is, of
course, well known that the F distribution is transformed to the beta distribution by putting
$F = qx/\{p(1-x)\}$, and in the reverse direction by putting $x = n_1F/(n_1F+n_2)$, $p = \tfrac12 n_1$,
$q = \tfrac12 n_2$. Similarly, the F distribution is transformed to Fisher’s z distribution by putting
$z = \tfrac12\log_e F$ $(-\infty<z<\infty;\ n_1,n_2>0)$. The advantage of the present formulae is that the
particular $x_P$ or $F_P$ may be calculated directly without reference to Fisher’s z, or to a table
of exponential functions. For $F_P$ we recommend $n_1, n_2 \ge 24$, and for $x_P$, $p, q \ge 12$. Some
numerical tables indicate the accuracy of the approximations for different values of
P = 0.90, 0.95, 0.975, 0.99, 0.995, and in some instances 0.999, where P is defined by

$$I_{x_P}(p,q) = P \quad\text{or}\quad I_{F_P}(n_1,n_2) = P. \tag{3}$$

We note

$$I_{1-x_P}(q,p) = 1-P \quad\text{and}\quad I_{1/F_P}(n_2,n_1) = 1-P. \tag{4}$$
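The Cornish-Fisher formula may be written in standard units (Carter, 1947),

$$t_P = \frac{x_P - m_x}{\sigma_x} \cong A_1 + A_2\alpha_{3:x} + A_3(\alpha_{4:x}-3) + A_4\alpha_{3:x}^2, \tag{5}$$

$$A_1 = \xi,\quad A_2 = \tfrac16(\xi^2-1),\quad A_3 = \tfrac1{24}(\xi^3-3\xi),\quad A_4 = -\tfrac1{36}(2\xi^3-5\xi),$$

where $x_P$ is the value sought, $m_x$ is the mean of the distribution, $\sigma_x$ the standard deviation,
$\alpha_{3:x}$ the third central moment divided by $\sigma_x^3$, $\alpha_{4:x}$ the fourth central moment divided by
$\sigma_x^4$, and $\xi$ is determined by

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\xi} e^{-\frac12 t^2}\,dt = P.$$

For the beta distribution, writing $r = p+q$, $m_x = p/r$, $\sigma_x^2 = pq/\{r^2(r+1)\}$,

$$\alpha_{3:x} = \frac{2(q-p)}{r+2}\left(\frac{r+1}{pq}\right)^{1/2}, \qquad \alpha_{4:x} = \frac{3(1+1/r)\,[1 + 2\{(1/p+1/q) - 3/r\}]}{(1+2/r)(1+3/r)}.$$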


The calculations are straightforward and simple. The method was suggested by Carter’s
paper on the z distribution (1947). Table 1 gives the values of $A_1$, $A_2$, $A_3$, $A_4$ for the various
values of P.
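The calculation in (5) can be sketched in a few lines of Python. This is only an illustration, not the author's own procedure: the function names are hypothetical, and the coefficients $A_1,\ldots,A_4$ are generated from the normal deviate $\xi$ by the standard Cornish-Fisher polynomials, an assumption that agrees with Table 1 to the figures printed there.

```python
from math import sqrt
from statistics import NormalDist


def cornish_fisher_coeffs(P):
    """A_1..A_4 of formula (5), from the normal deviate xi with Phi(xi) = P.

    Assumption: the standard Cornish-Fisher polynomials in xi.
    """
    xi = NormalDist().inv_cdf(P)
    a1 = xi
    a2 = (xi**2 - 1) / 6
    a3 = (xi**3 - 3 * xi) / 24
    a4 = -(2 * xi**3 - 5 * xi) / 36
    return a1, a2, a3, a4


def beta_level(p, q, P):
    """Approximate x_P such that I_{x_P}(p, q) = P, using formula (5)."""
    r = p + q
    mean = p / r
    sd = sqrt(p * q / (r**2 * (r + 1)))
    alpha3 = 2 * (q - p) / (r + 2) * sqrt((r + 1) / (p * q))
    alpha4 = (3 * (1 + 1 / r) * (1 + 2 * ((1 / p + 1 / q) - 3 / r))
              / ((1 + 2 / r) * (1 + 3 / r)))
    a1, a2, a3, a4 = cornish_fisher_coeffs(P)
    t = a1 + a2 * alpha3 + a3 * (alpha4 - 3) + a4 * alpha3**2
    return mean + sd * t


print(beta_level(3, 3, 0.90))
print(beta_level(6, 12, 0.95))
```

The two calls may be compared with the approximate entries 0.751 and 0.521 given in Table 2 for (p, q) = (3, 3) and (6, 12).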





































Table 1

P        A_1        A_2         A_3          A_4

0.90     1.281552   0.107063    -0.0724944   +0.0610606
0.95     1.644854   0.2842575   -0.0201807   -0.0187828
0.975    1.959964   0.4735765   +0.0687179   -0.146067
0.99     2.326348   0.7353158   +0.233788    -0.376338
0.995    2.575829   0.9391492   +0.390119    -0.591710
0.999    3.090232   1.424922    +0.843316    -1.21026





In Table 2, the accuracy of $x_P$ as determined by (5) may be seen for selected values of
p, q and P.
Table 2.* Percentage levels for $x_P$

p, q               P = 0.90   0.95     0.975    0.99     0.995    0.999

3, 3     approx.   0.751      0.813    0.861    0.910    0.938    0.978
         exact     0.753      0.811    0.853    0.894    0.917    0.952
3, 12    approx.   0.336      0.383    0.425    0.4745   0.509    0.5807
         exact     0.337      0.385    0.428    0.478    0.512    0.5811
6, 3     approx.   0.856      0.895    0.924    0.950    0.963    0.977
         exact     0.853      0.889    0.915    0.939    0.9525   0.973
6, 12    approx.   0.477      0.521    0.558    0.6017   0.6308   0.689
         exact     0.478      0.522    0.560    0.6025   0.6310   0.687
12, 30   approx.   0.3765     0.4045   0.4291   0.4580   0.4778   0.5188
         exact     0.3767     0.4048   0.4294   0.4583   0.4781   0.5187
20, 60   approx.   0.31307    0.3325   0.3497   0.3700   0.3840   0.4132
         exact     0.31312    0.3326   0.3498   0.3701   0.3841   -
60, 20   approx.   0.8105     0.8256   0.8381   0.8521   0.8611   0.8785
         exact     0.8104     0.8255   0.8380   0.8519   0.8609   -








* First value in cell is approximate value of $x_P$ by (5). Second value in cell is exact value of $x_P$ given by
the tables of Thompson (1941) or of Fisher & Yates (1948) or by the method of Thomson (1947).


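Previously (Aroian, 1947) formulae were given for the F distribution for $n_1$ and $n_2$ large:

$$\begin{aligned}
F_{0.95} &= m_F + \sigma_F(1.64485 + 0.28392\,\alpha_{3:F} - 0.04902\,\alpha_{3:F}^2),\\
F_{0.99} &= m_F + \sigma_F(2.32635 + 0.73330\,\alpha_{3:F} - 0.024957\,\alpha_{3:F}^2),\\
F_{0.999} &= m_F + \sigma_F(3.0903 + 1.4190\,\alpha_{3:F} + 0.05667\,\alpha_{3:F}^2),
\end{aligned}$$

where

$$m_F = \frac{n_2}{n_2-2}, \qquad \sigma_F = \left\{\frac{2n_2^2(n_1+n_2-2)}{n_1(n_2-2)^2(n_2-4)}\right\}^{1/2}, \qquad \alpha_{3:F} = \frac{4(2n_1+n_2-2)}{n_2-6}\left\{\frac{n_2-4}{2n_1(n_1+n_2-2)}\right\}^{1/2}.$$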



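The present set, for $n_1, n_2 \ge 24$, in error by less than 7 in the fourth significant figure, and modified
by a least squares process, is

$$\begin{aligned}
F_{0.95} &= m_F + \sigma_F(1.64485 + 0.28392\,\alpha_{3:F} - 0.04902\,\alpha_{3:F}^2 - 0.02987\,\alpha_{3:F}^3),\\
F_{0.975} &= m_F + \sigma_F(1.95996 + 0.47228\,\alpha_{3:F} - 0.04304\,\alpha_{3:F}^2 - 0.03108\,\alpha_{3:F}^3),\\
F_{0.99} &= m_F + \sigma_F(2.32635 + 0.73330\,\alpha_{3:F} + 0.03043\,\alpha_{3:F}^2 - 0.06071\,\alpha_{3:F}^3),\\
F_{0.995} &= m_F + \sigma_F(2.57583 + 0.93600\,\alpha_{3:F} + 0.09526\,\alpha_{3:F}^2 - 0.07380\,\alpha_{3:F}^3).
\end{aligned} \tag{6}$$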
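The set (6) is easily evaluated by machine. The sketch below is illustrative only: the function names are hypothetical, the coefficients are those of (6) as read here, and $m_F$, $\sigma_F$ and $\alpha_{3:F}$ are taken as the usual mean, standard deviation and standardized third moment of F, which requires $n_2 > 6$.

```python
from math import sqrt

# Rows of (6): P -> (c0, c1, c2, c3) in t = c0 + c1*a + c2*a**2 + c3*a**3,
# where a stands for alpha_{3:F}.
COEFFS_6 = {
    0.95: (1.64485, 0.28392, -0.04902, -0.02987),
    0.975: (1.95996, 0.47228, -0.04304, -0.03108),
    0.99: (2.32635, 0.73330, 0.03043, -0.06071),
    0.995: (2.57583, 0.93600, 0.09526, -0.07380),
}


def f_moments(n1, n2):
    """Mean, standard deviation and alpha_3 of F(n1, n2); needs n2 > 6."""
    mean = n2 / (n2 - 2)
    var = 2 * n2**2 * (n1 + n2 - 2) / (n1 * (n2 - 2) ** 2 * (n2 - 4))
    alpha3 = (4 * (2 * n1 + n2 - 2) / (n2 - 6)
              * sqrt((n2 - 4) / (2 * n1 * (n1 + n2 - 2))))
    return mean, sqrt(var), alpha3


def f_level(n1, n2, P):
    """Approximate F_P(n1, n2) from the cubic in alpha_3 given in (6)."""
    c0, c1, c2, c3 = COEFFS_6[P]
    mean, sd, a = f_moments(n1, n2)
    return mean + sd * (c0 + c1 * a + c2 * a**2 + c3 * a**3)


print(f_level(24, 24, 0.95))
print(f_level(60, 120, 0.99))
```

The printed values may be compared with the first entries of the cells for (24, 24) and (60, 120) in Table 3, namely 1.9796 and 1.6552.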


An example will illustrate the method, which is quite general and particularly useful where
asymptotic formulae are difficult to obtain by analytic or other mathematical processes. Let

$$t_{0.99} = \frac{F_{0.99} - m_F}{\sigma_F} = 2.32635 + 0.73330\,\alpha_{3:F} + A\alpha_{3:F}^2 + B\alpha_{3:F}^3.$$

We determine A and B by least squares, leading to the equations

$$\begin{aligned}
\Sigma\alpha_3^2 t - 2.32635\,\Sigma\alpha_3^2 - 0.73330\,\Sigma\alpha_3^3 &= A\,\Sigma\alpha_3^4 + B\,\Sigma\alpha_3^5,\\
\Sigma\alpha_3^3 t - 2.32635\,\Sigma\alpha_3^3 - 0.73330\,\Sigma\alpha_3^4 &= A\,\Sigma\alpha_3^5 + B\,\Sigma\alpha_3^6
\end{aligned}$$

(the subscript F is omitted in $\alpha_{3:F}$), where t is obtained from the exact values of F
(Merrington & Thompson 1944) over the set

(24, 24), (24, 30), (24, 60), (60, 24), (24, 120), (120, 24), (24, ∞), (∞, 24), (60, 60),
(24, 40), (60, 120), (120, 60), (60, ∞), (∞, 60), (120, 120), (120, ∞), (∞, 120).
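The two normal equations above form a 2 × 2 linear system in A and B. A minimal sketch of their solution follows; the function name and the sample values of $\alpha_3$ are hypothetical, and the values of t are synthetic (generated from a known A and B) rather than taken from the Merrington & Thompson tables.

```python
def fit_A_B(alphas, ts, c0=2.32635, c1=0.73330):
    """Solve the two normal equations for A and B in
    t = c0 + c1*a + A*a**2 + B*a**3, with c0 and c1 held fixed."""

    def s(k):
        # Sum of a**k over the fitting set.
        return sum(a**k for a in alphas)

    def st(k):
        # Sum of a**k * t over the fitting set.
        return sum(a**k * t for a, t in zip(alphas, ts))

    r1 = st(2) - c0 * s(2) - c1 * s(3)  # left side of the first equation
    r2 = st(3) - c0 * s(3) - c1 * s(4)  # left side of the second equation
    det = s(4) * s(6) - s(5) ** 2
    A = (r1 * s(6) - r2 * s(5)) / det
    B = (s(4) * r2 - s(5) * r1) / det
    return A, B


# Synthetic check: build t from known A and B, then recover them.
alphas = [0.3, 0.6, 0.9, 1.2, 1.5]
ts = [2.32635 + 0.73330 * a + 0.03043 * a**2 - 0.06071 * a**3 for a in alphas]
print(fit_A_B(alphas, ts))
```

With exact tabulated values of F in place of the synthetic t, the same routine would yield fitted constants of the kind appearing in the third line of (6).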


Actually, for better asymptotic results, more values of F should be taken with values of
$\alpha_3 \le 1$. Too much weight has been given to large values of $\alpha_3$. Hence these formulae are not
the best possible. It is apparent that, for the success of such a method, a certain number of
exact values are needed and also some idea of the limiting distribution for $n_1$ and $n_2$ large.
Table 3 compares the exact results with the approximate results by the new set, by Paulson’s
formula, and by Carter’s formula. Table 3 is only a small selection of very extensive tests of
all three formulae.

Paulson (1942) has given a formula for F which is quite accurate, although somewhat more
tedious to calculate:

$$F^{1/3} = \{-\beta + \sqrt{(\beta^2 - 4\alpha\gamma)}\}/(2\alpha),$$

where $A = 2/(9n_2)$, $B = 2/(9n_1)$,

$$\alpha = (1-A)^2 - \xi^2 A, \qquad \beta = -2(1-A)(1-B), \qquad \gamma = (1-B)^2 - \xi^2 B.$$
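Paulson’s formula may be sketched as follows. The function name is hypothetical, and the quadratic is taken to be in $F^{1/3}$ with $\xi$ the normal deviate for P; this reading is an assumption, adopted because it reproduces the Paulson entries of Table 3.

```python
from math import sqrt
from statistics import NormalDist


def paulson_f(n1, n2, P):
    """Approximate F_P(n1, n2) by Paulson's formula.

    Assumption: the quadratic alpha*u**2 + beta*u + gamma = 0 is in
    u = F**(1/3), and the larger root gives the upper percentage level.
    """
    xi = NormalDist().inv_cdf(P)
    A = 2 / (9 * n2)
    B = 2 / (9 * n1)
    alpha = (1 - A) ** 2 - xi**2 * A
    beta = -2 * (1 - A) * (1 - B)
    gamma = (1 - B) ** 2 - xi**2 * B
    u = (-beta + sqrt(beta**2 - 4 * alpha * gamma)) / (2 * alpha)
    return u**3


print(paulson_f(24, 24, 0.95))
print(paulson_f(24, 120, 0.95))
```

The printed values may be compared with the second entries of the corresponding cells of Table 3, namely 1.9840 and 1.6082.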


Perhaps it should be mentioned that Paulson’s formula gives two answers, $F(n_1,n_2)$ and
$F(n_2,n_1)$, simultaneously. Carter (1947) has given a formula for $z = \tfrac12\log_e F$,

[Carter’s formula for z: not recoverable from the scan]

and from Table 3 it may be seen that this is on the whole the most accurate formula in the
range $n_1, n_2 \ge 24$ for F. However, tables of the exponential function are needed.










Table 3. Comparison of approximations to certain percentage levels for F*

n_1, n_2   Formula    P = 0.95   0.975    0.99     0.995

24, 24     Eq. (6)    1.9796     2.2677   2.6582   2.9683
           Paulson    1.9840     2.2708   2.6636   2.9750
           Carter     1.9842     2.2701   2.6607   2.9691
           Exact      1.9838     2.2693   2.6591   2.9667
24, 30     Eq. (6)    1.8874     2.1349   2.4685   2.7273
           Paulson    1.8874     2.1366   2.4715   2.7319
           Carter     1.8868     2.1361   2.4695   2.7279
           Exact      1.8874     2.1359   2.4689   2.7272
24, 120    Eq. (6)    1.6078     1.7576   1.9506   2.0919
           Paulson    1.6082     1.7598   1.9507   2.0904
           Carter     1.6091     1.7602   1.9500   2.0886
           Exact      1.6084     1.7597   1.9500   2.0890
24, ∞      Eq. (6)    1.5158     1.6386   1.7934   1.9046
           Paulson    1.5170     1.6403   1.7918   1.9002
           Carter     1.5185     1.6412   1.7911   1.8980
           Exact      1.5173     1.6402   1.7908   1.8983
60, 120    Eq. (6)    1.4296     1.5291   1.6552   1.7462
           Paulson    1.4290     1.5300   1.6558   1.7472
           Carter     1.4290     1.5299   1.6557   1.7468
           Exact      1.4290     1.5299   1.6557   1.7469
120, 24    Eq. (6)    1.7913     2.0116   2.3108   2.5464
           Paulson    1.7900     2.0119   2.3158   2.5567
           Carter     1.7919     2.0129   2.3134   2.5497
           Exact      1.7897     2.0099   2.3099   2.5463
120, 60    Eq. (6)    1.4691     1.5807   1.7252   1.8314
           Paulson    1.4673     1.5813   1.7270   1.8352
           Carter     1.4672     1.5813   1.7266   1.8344
           Exact      1.4673     1.5810   1.7263   1.8341
120, ∞     Eq. (6)    1.2213     1.2683   1.3249   1.3644
           Paulson    1.2214     1.2684   1.3247   1.3639
           Carter     1.2215     1.2684   1.3246   1.3636
           Exact      1.2214     1.2684   1.3246   1.3637
∞, 24      Eq. (6)    1.7340     1.9379   2.2116   2.4276
           Paulson    1.7334     1.9375   2.2172   2.4392
           Carter     1.7362     1.9391   2.2148   2.4313
           Exact      1.7331     1.9353   2.2107   2.4276
∞, 120     Eq. (6)    1.2543     1.3101   1.3798   1.4298
           Paulson    1.2539     1.3105   1.3807   1.4315
           Carter     1.2540     1.3106   1.3807   1.4312
           Exact      1.2539     1.3104   1.3805   1.4311

* First value by equation (6). Second value, Paulson’s formula. Third value, Carter’s formula.
Fourth value, exact result.










For the incomplete beta function, if p is large and q small, approximate formulae have been
given by Thomson (1947). We expect in the future to apply the second method in this case
by assuming

$$t = \sum_{i=0} A_i\alpha_3^i + \sum_{j=0} B_j\alpha_4^j,$$

since all higher moments $\alpha_K$, $K \ge 5$, are determined
by $\alpha_3$ and $\alpha_4$. For (p, q) in the range of this paper the Scheffé-Tukey formula (1944) (see also
Murphy, 1948) is inappropriate, as is the Cornish-Fisher expansion for the F distribution.


REFERENCES 


Aroian, L. A. (1947). Biometrika, 34, 359.

Carter, A. H. (1947). Biometrika, 34, 352.

Fisher, R. A. & Yates, F. (1948). Statistical Tables for Biological, Agricultural, and Medical Research.
Edinburgh: Oliver and Boyd.

Merrington, Maxine & Thompson, Catherine M. (1944). Biometrika, 33, 73.

Murphy, R. B. (1948). Ann. Math. Statist. 19, 589.

Paulson, E. (1942). Ann. Math. Statist. 13, 233.

Scheffé, H. & Tukey, J. W. (1944). Ann. Math. Statist. 15, 217.

Thompson, Catherine M. (1941). Biometrika, 32, 168.

Thomson, D. H. (1947). Biometrika, 34, 368.













ON THE GENERALIZED SECOND LIMIT-THEOREM 
IN THE CALCULUS OF PROBABILITIES 


By K. S. RAO, Department of Statistics, Bombay University
AND DAVID G. KENDALL, Magdalen College, University of Oxford


1. Introduction. The generalized second limit-theorem in the calculus of probabilities
is concerned with the relationship between the convergence of a sequence of distribution
functions $\{F^{(n)}(x)\}$ to a limit-function $G(x)$ and the convergence of the associated moments*

$$\mu_r^{(n)} = \int_{-\infty}^{\infty} x^r\,dF^{(n)}(x) \quad (r = 0,1,2,\ldots), \tag{1}$$

to limits $\{\lambda_r\}$, as n tends to infinity. It enables one, in many cases, to answer such questions
as the following: Is $G(x)$ itself a distribution function? Is $\{\lambda_r\}$ a moment-sequence? Does
$G(x)$ possess finite moments, and are these identical with the $\lambda_r$?

The importance of the theorem, in justifying many of the formal operations of statisticians, 
is obvious; this is particularly true of its classical form (Tchebycheff, Markoff), in which only 
the conditions for an approach to normality are considered. The proof, however, seems 
generally to have been considered too difficult for the student; thus in M. G. Kendall’s 
Advanced Theory of Statistics it is only presented with some apology, and from H. Cramér’s 
Mathematical Methods of Statistics it is omitted altogether. The purpose of this essay is to 
place a simple and attractive proof within the reach of every mathematical undergraduate. 
Several new devices have been employed, but these will not be specially indicated. In a 
number of details we have followed quite closely the original treatments by A. Wintner 
(1928) and M. Fréchet & J. Shohat (1931). 


2. Lemmas concerning distribution functions. 

(i) Every infinite sequence of distribution functions $\{F^{(n)}(x)\}$ contains a subsequence which
converges to a bounded monotone-increasing function $G(x)$ at all its points of continuity. This
limit-function can be taken to be continuous-to-the-right.

The proof of this lemma, which depends on the classical ‘diagonal process’, will be found 
in Cramér’s book (§ 6-8). It should be recalled that G(x), as a bounded monotone function, 
can have at the most a countable infinity of points of discontinuity. 

(ii) Let $\mu_2^{(n)}$ be the second moment of the nth member of the sequence of distributions referred to
in Lemma (i). Then the condition

$$\mu_2^{(n)} < A_2 < \infty \quad (\text{for all } n > N_2), \tag{2}$$

ensures that $G(x)$ is itself a distribution function.

For by the Tchebycheff inequality

$$F^{(n)}(-X) + [1 - F^{(n)}(X)] < A_2/X^2 \quad (n > N_2),$$

and so, if X is sufficiently large,

$$F^{(n)}(-X) + [1 - F^{(n)}(X)] < \epsilon$$

uniformly for all $n > N_2$, and

$$G(-X) + [1 - G(X)] < \epsilon$$

* Throughout this paper $\mu_j^{(n)}$ will denote a moment about the origin, x = 0.




if (as can evidently be arranged) $G(x)$ is continuous at $x = \pm X$. Thus $G(-\infty) = 0$, $G(\infty) = 1$,
and $G(x)$ is a distribution function. The simple example,

$$F^{(n)}(x) = \tfrac12[1 + \operatorname{sgn}(x-n)], \tag{3}$$

makes it clear that a condition such as (2) is essential in Lemma (ii).

(iii) If the sequence of distribution functions $\{F^{(n)}(x)\}$ converges to the distribution function
$G(x)$ at all its points of continuity, then the convergence is uniform in every closed interval-of-
continuity of $G(x)$.

Let $G(x)$ be continuous in $a \le x \le b$, and let the scale of subdivision

$$a = c_0 < c_1 < \ldots < c_k = b$$

be so chosen that the increase of $G(x)$ in each interval $(c_r \le x \le c_{r+1})$ is at most $\epsilon$. Choose M
so large that

$$|F^{(n)}(c_r) - G(c_r)| < \epsilon \quad (r = 0,1,2,\ldots,k)$$

whenever n exceeds M. Then for any such n, and for any x in the interval (a, b), there will
exist an r such that

$$c_r \le x \le c_{r+1}$$

and

$$G(x) - 2\epsilon \le G(c_r) - \epsilon < F^{(n)}(c_r) \le F^{(n)}(x) \le F^{(n)}(c_{r+1}) < G(c_{r+1}) + \epsilon \le G(x) + 2\epsilon,$$

so that

$$|F^{(n)}(x) - G(x)| < 2\epsilon,$$

this inequality being true for all values of x in (a, b) provided only that n > M. The result is
therefore established.

(iv) If, in Lemma (iii), $G(x)$ is a continuous distribution function, then the convergence is
uniform throughout the whole interval $-\infty < x < \infty$.

Let X be chosen so large that

$$G(-X) + [1 - G(X)] < \epsilon,$$

and let M′ be such that

$$|F^{(n)}(-X) - G(-X)| < \epsilon \quad\text{and}\quad |F^{(n)}(X) - G(X)| < \epsilon$$

for all values of n greater than M′. If we now identify the interval (−X, X) with the interval
(a, b) of the previous lemma, a simple continuation of the argument employed there shows
that, for every x,

$$|F^{(n)}(x) - G(x)| < 3\epsilon,$$

provided only that n > max(M, M′).


3. Lemmas concerning characteristic functions. 


(i) If the 2mth moment, $\mu_{2m}$, of the distribution function $F(x)$ is finite, then its characteristic
function,

$$C(t) = \int_{-\infty}^{\infty} e^{itx}\,dF(x),$$

can be expressed in the form

$$C(t) = \sum_{s=0}^{2m-1}\mu_s\frac{(it)^s}{s!} + \rho\,\mu_{2m}\frac{(it)^{2m}}{(2m)!} \tag{4}$$

for all real values of t, the coefficient $\rho$ denoting a complex number such that $|\rho| \le 1$. Also

$$\lim_{t\to0}\rho = 1. \tag{4a}$$
Statements roughly equivalent to this occur at several points in Cramér’s book, but the
proof (which is not quite obvious) is not given there. The lemma is trivially true when
m = 0; in what follows we shall therefore suppose that m > 0. With the aid of the identity

$$e^{iu} - \sum_{s=0}^{n}\frac{(iu)^s}{s!} = i\int_0^u\left(e^{iv} - \sum_{s=0}^{n-1}\frac{(iv)^s}{s!}\right)dv,$$













and an inductive argument, it is quite easy to show that

$$e^{ixt} = \sum_{s=0}^{n}\frac{(ixt)^s}{s!} + \rho'\frac{(ixt)^{n+1}}{(n+1)!} \quad (n = 0,1,2,\ldots),$$

where $|\rho'| \le 1$ if x and t are both real. Substitution in the integral formula for $C(t)$ now gives
the first part of (i). It is convenient (and of course permissible) to let $\rho' = 1$ when either of
x and t is zero, and to let $\rho = 1$ when either of t and $\mu_{2m}$ is zero. The expansion (4) can be taken
to an odd power of t if the corresponding absolute moment exists, and is inserted in the last
term, but of course in this case $\rho$ does not necessarily tend to unity when $t \to 0$.

In the formula for $e^{ixt}$ it is easy to show that

$$|\rho' - 1| \le |xt|/(n+2),$$

so that $\rho' \to 1$ as $t \to 0$, uniformly in any finite x-interval. Also one can write

$$(\rho - 1)\mu_{2m} = \int_{-\infty}^{\infty}(\rho' - 1)x^{2m}\,dF(x),$$

and so

$$|\rho - 1|\,\mu_{2m} < \epsilon + \int_{-X}^{X}|\rho' - 1|\,x^{2m}\,dF(x),$$

if X is large enough (by the convergence of the integral defining $\mu_{2m}$). But for fixed X the
integral over (−X, X) can be made less than $\epsilon$ by choosing t to be sufficiently close to zero
(because $|\rho' - 1|$ then tends to zero, uniformly in the range of integration). Thus

$$|\rho - 1|\,\mu_{2m} < 2\epsilon$$

for all sufficiently small values of |t|, and so* $\rho \to 1$ when $t \to 0$.
(ii) Suppose that, for some positive integer m,

$$C(t) = \sum_{s=0}^{2m}\lambda_s\frac{(it)^s}{s!} + o(t^{2m}), \tag{5}$$

when t tends to zero through real values. Then the first 2m moments of the distribution exist, and
they are identical with the coefficients $\lambda_s$.

(Cf. § 10.1 of Cramér’s book, and also a paper by A. Zygmund (1947).) In the enunciation
of the lemma the term $o(t^{2m})$ is to be interpreted in the strict sense; it could be written,
more explicitly, as $t^{2m}g(t)$, where $g(t)$ is a function which tends to zero with t. Let $\delta^2$ denote
the operation of forming a symmetrical second difference, so that

$$\delta^2 A(t) = A(t+2h) + A(t-2h) - 2A(t),$$

and indeed

$$\delta = E - E^{-1},$$

if E is the usual ‘shift’ operator, the application of which transforms $f(t)$ into $f(t+h)$. Let also

$$D^{2p}A(t) = \lim_{h\to0}\left(\frac{E - E^{-1}}{2h}\right)^{2p}A(t)$$

if the limit exists; this is the symmetrical derivate of $A(t)$, of order 2p. For polynomials, the
symmetrical derivate is equal to the ordinary differential coefficient of the same order, while
a little calculation shows that, when t = 0,

$$\left(\frac{\delta^2}{4h^2}\right)^m[t^{2m}g(t)] = \sum_{j=0}^{2m}\binom{2m}{j}(-1)^j(m-j)^{2m}g(2mh - 2jh),$$

* There is no difficulty when $\mu_{2m} = 0$, for then $\rho = 1$, by definition.




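so that

$$\left|\left(\frac{\delta^2}{4h^2}\right)^m[t^{2m}g(t)]\right| \le \sum_{j=0}^{2m}\binom{2m}{j}|m-j|^{2m}\epsilon, \quad\text{when } |h| \le \frac{\eta}{2m},$$

if $|g(t)| < \epsilon$ for all values of t satisfying $|t| \le \eta$. Thus, if $C(t)$ has the property described in
the lemma,

$$D^{2m}C(t) = (-1)^m\lambda_{2m}, \quad\text{when } t = 0.$$

On the other hand,

$$\delta^2 e^{itx} = -(2\sin xh)^2 e^{itx},$$

and so it follows that

$$\lim_{h\to0}\int_{-\infty}^{\infty}\left(\frac{\sin xh}{h}\right)^{2m}dF(x) = \lambda_{2m}.$$

From the uniform convergence (to unity) of

$$\left(\frac{\sin xh}{xh}\right)^{2m}$$

over the finite interval (a, b), it follows that

$$\int_a^b x^{2m}\,dF(x) \le \lambda_{2m},$$

and so, this being true for all finite values of a and b, the moment $\mu_{2m}$ must be finite. Thus,
by Lemma 3 (i),

$$C(t) = \sum_{s=0}^{2m}\mu_s\frac{(it)^s}{s!} + o(t^{2m})$$

as t tends to zero, and a comparison of the two asymptotic expressions for $C(t)$ now makes it
clear that

$$\mu_s = \lambda_s \quad (s = 0,1,2,\ldots,2m).$$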
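The role of the symmetrical derivate in Lemma 3 (ii) can be illustrated numerically. The sketch below (hypothetical function names; the unit normal characteristic function is chosen only as a convenient real-valued example) forms the 2m-th symmetric difference at t = 0 with shift step h, in the sense $\delta = E - E^{-1}$, and divides by $(2h)^{2m}$; after multiplication by $(-1)^m$ the result should lie close to the 2m-th moment.

```python
from math import comb, exp


def symmetric_derivate(C, m, h):
    """(delta^2 / 4h^2)^m C(t) at t = 0, where delta = E - E^{-1} and
    E shifts the argument by h; C must be real-valued here."""
    total = sum((-1) ** j * comb(2 * m, j) * C((2 * m - 2 * j) * h)
                for j in range(2 * m + 1))
    return total / (2 * h) ** (2 * m)


def normal_cf(t):
    """Characteristic function of the unit normal distribution."""
    return exp(-t * t / 2)


for m in (1, 2):
    # (-1)^m times the derivate approximates the 2m-th moment (1 and 3 here).
    print(m, (-1) ** m * symmetric_derivate(normal_cf, m, 0.01))
```

For a small fixed h the agreement is only approximate, since the lemma concerns the limit as h tends to zero.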


4. The fundamental theorem. It is now possible to prove the following result:

A. Let the nth member of the sequence of distribution functions $\{F^{(n)}(x)\}$ possess a finite
moment of the jth order, for all $n > N_j$ (j = 0, 1, 2, ...); let the limits

$$\lim_{n\to\infty}\mu_j^{(n)} = \lambda_j$$

exist for every value of j; and let the sequence $\{F^{(n)}(x)\}$ converge to a limit-function $G(x)$ (bounded,
monotone-increasing, and continuous-to-the-right) at all its points of continuity. Then

(a) $G(x)$ is a distribution function possessing moments of all orders;

(b) $\{\lambda_j\}$ is a moment-sequence; and

(c) $\lambda_j$ is the jth moment of $G(x)$.

First, as in the proof of Lemma 2 (ii), one can show that $G(x)$ is itself a distribution function,
for the second moments $\mu_2^{(n)}$ form a convergent (and so bounded) sequence. Thus, by the first
limit-theorem of the calculus of probabilities, the sequence of characteristic functions
$\{C^{(n)}(t)\}$ converges for every real value of t to $\Gamma(t)$, the characteristic function of $G(x)$.

Now, when $n > N_{2m}$,

$$C^{(n)}(t) = \sum_{s=0}^{2m-1}\mu_s^{(n)}\frac{(it)^s}{s!} + \rho_n\,\mu_{2m}^{(n)}\frac{(it)^{2m}}{(2m)!},$$

and so the last term on the right must approach a limit when n tends to infinity, for every
other term does so (and this is true for every real value of t). Thus, for m = 0, 1, 2, ...,

$$\Gamma(t) = \sum_{s=0}^{2m-1}\lambda_s\frac{(it)^s}{s!} + R\,\lambda_{2m}\frac{(it)^{2m}}{(2m)!},$$







where R is a complex number such that $|R| \le 1$. (As before, it is convenient, and permissible,
to define R to be equal to 1 if t or $\lambda_{2m}$ is zero.) By considering each of these formulae in
association with the one corresponding to the next-following value of m, it is easy to show that

$$\lim_{t\to0}R = 1.$$

The results claimed in the theorem now follow from Lemma 3 (ii).
It is important to notice that (with the stated conditions) the theorem is still true even if,
for each fixed value of n, a complete sequence of moments for $F^{(n)}(x)$ does not exist.


5. The second limit-theorem (direct form).

B. Let the sequence of distribution functions $\{F^{(n)}(x)\}$ converge to a limit-distribution $G(x)$
at all its points of continuity, and let the moments $\mu_j^{(n)}$ exist when $n > N_j$ and be such that

$$|\mu_j^{(n)}| < A_j < \infty \quad (j = 0,1,2,\ldots). \tag{6}$$

Then all the moments $\lambda_j$ of $G(x)$ exist, and

$$\mu_j^{(n)} \to \lambda_j \quad (j = 0,1,2,\ldots),$$

as $n \to \infty$.

As in §4, the sequence of characteristic functions $C^{(n)}(t)$ converges for every real value of t
to $\Gamma(t)$, the characteristic function of $G(x)$, and also

$$C^{(n)}(t) = 1 + it\mu_1^{(n)} + O(t^2) \quad (n > N_2),$$

where the constant in the inequality implied by the last term on the right-hand side can be
taken to be independent both of t and of n. It follows that

$$|\mu_1^{(m)} - \mu_1^{(n)}| < A_2|t| + \eta_{m,n}(t)$$

for every non-zero t, where $\eta_{m,n}(t)$ tends to zero as m and n tend independently to infinity.
But this cannot be true unless

$$\lim_{m,n\to\infty}|\mu_1^{(m)} - \mu_1^{(n)}| = 0,$$

and so there exists a constant $\lambda_1$ such that

$$\lim_{n\to\infty}\mu_1^{(n)} = \lambda_1.$$

The next stage of the argument depends on the expression

$$C^{(n)}(t) = 1 + it\mu_1^{(n)} + \tfrac12(it)^2\mu_2^{(n)} + O(t^3) \quad (n > N_6),$$

for the characteristic function of the nth distribution. (This is a consequence of a remark
attached to the proof of Lemma 3 (i), the boundedness of the absolute third moments following
by familiar inequalities from the boundedness of the sixth moments.) It now follows that

$$|\mu_2^{(m)} - \mu_2^{(n)}| < \tfrac23 A_6^{1/2}|t| + \delta_{m,n}(t)$$

for every non-zero t, and as before this implies the existence of

$$\lim_{n\to\infty}\mu_2^{(n)} = \lambda_2.$$

In this way the existence of the complete sequence of limits $\{\lambda_j\}$ can be demonstrated
inductively, and it only remains to be shown that $\lambda_j$ is the jth moment of $G(x)$ for every j.
This fact, however, follows at once from what we have called the fundamental theorem
(stated and proved here in § 4).




6. The second limit-theorem (converse form).

C. Let the moments $\mu_j^{(n)}$ exist when $n > N_j$ (j = 0, 1, 2, ...), and let the limits

$$\lim_{n\to\infty}\mu_j^{(n)} = \lambda_j$$

exist and be the moments of a UNIQUE distribution $G(x)$. Then the sequence of distribution
functions $\{F^{(n)}(x)\}$ converges to $G(x)$ at all its points of continuity.

(By Lemmas 2 (iii) and (iv) it then follows that the convergence to $G(x)$ is uniform in every
closed interval-of-continuity, and that it is uniform throughout the whole range $-\infty < x < \infty$
if it happens that $G(x)$ is a continuous function.)

In the first place it is clear that a subsequence of distribution functions can be selected
which converges to a limit-distribution $H(x)$ (say) at all its points of continuity. (This is
a consequence of Lemmas 2 (i) and (ii).) It then follows from Theorem A that $H(x)$ must be
the unique distribution having the $\lambda_j$ as its moments, and so $H(x) \equiv G(x)$.

Now let x = a be any point of continuity of $G(x)$, and suppose that (if possible)

$$\alpha = \varlimsup_{n\to\infty}F^{(n)}(a) \ne G(a).$$

We can choose a subsequence of distribution functions converging to the limit-value $\alpha$ at
x = a, and the process described in the proofs of Lemmas 2 (i) and (ii) can then be applied
again to yield a sub-subsequence which at x = a converges both to $\alpha$ (by previous arrangement)
and also to $G(a)$ because $G(x)$ is continuous at x = a. This and another similar
contradiction can be avoided if and only if

$$\varlimsup_{n\to\infty}F^{(n)}(a) = G(a) = \varliminf_{n\to\infty}F^{(n)}(a)$$

for every point of continuity of the G-distribution. Theorem C has thus been proved.

It will be noticed that the first part of the above argument can be arranged to prove that
the limit of a moment-sequence is always a moment-sequence. The hypothesis essential to the
proof of the second limit-theorem is that the $\lambda_j$ (which must in any case be the moments of
some distribution) are in fact the moments of one distribution alone.


7. The second limit-theorem (classical form). The classical form of the second limit-theorem
was concerned with the conditions for an approach to a limiting distribution of the
normal (or Gaussian) type. The analogue of Theorem C is the one most useful in practice,
and its deduction from Theorem C depends on the fact that the normal distribution is
uniquely characterized by its moment-sequence. This last fact can be deduced (see, for
example, Cramér, § 15.4) from the existence of a power-series expansion with a non-zero
radius of convergence for $e^{-\frac12 t^2}$, the characteristic function of the normal distribution. One
thus obtains:

D. If the moments $\mu_j^{(n)}$ exist for $n > N_j$ (j = 0, 1, 2, ...), if

$$\lim_{n\to\infty}\mu_{2k}^{(n)} = \frac{(2k)!}{2^k\,k!} \quad (k = 0,1,2,\ldots),$$

and if all the odd moments tend to zero as n tends to infinity, then

$$\lim_{n\to\infty}F^{(n)}(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac12 u^2}\,du$$

uniformly in $-\infty < x < \infty$.
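Theorem D can be illustrated by a familiar case. The sketch below (hypothetical function names; the binomial distribution with probability one-half is chosen only as an example) computes exact moments of the standardized binomial variate and the corresponding moments $(2k)!/(2^k k!)$ of the unit normal distribution, so that the approach of the former to the latter may be observed as n increases.

```python
from math import comb, factorial, sqrt


def normal_moment(j):
    """j-th moment about the origin of the unit normal distribution."""
    if j % 2:
        return 0.0
    k = j // 2
    return factorial(j) / (2**k * factorial(k))


def standardized_binomial_moment(n, j):
    """j-th moment of (S - n/2) / (sqrt(n)/2), where S is Binomial(n, 1/2)."""
    sd = sqrt(n) / 2
    return sum(comb(n, k) / 2**n * ((k - n / 2) / sd) ** j
               for k in range(n + 1))


for n in (10, 100, 1000):
    # The fourth moment approaches the normal value 3 as n grows.
    print(n, standardized_binomial_moment(n, 4), normal_moment(4))
```

In this example the fourth moment equals 3 − 2/n exactly, and the odd moments vanish by symmetry.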













Of course the direct converse of Theorem D is false, unless some such condition as (6) is
added. An illustration of this is provided by the example

$$F^{(n)}(x) = \left(1 - \frac1n\right)\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac12 u^2}\,du + \frac{1}{2n}\{1 + \operatorname{sgn}(x-n)\}, \tag{7}$$

the higher moments of which all tend to infinity with n.
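The behaviour of the moments in example (7) is easily tabulated. The sketch below assumes the reading of (7) as a mixture placing weight 1 − 1/n on the unit normal distribution and weight 1/n on the single point x = n; the function names are hypothetical.

```python
from math import factorial


def normal_moment(j):
    """j-th moment about the origin of the unit normal distribution."""
    if j % 2:
        return 0.0
    k = j // 2
    return factorial(j) / (2**k * factorial(k))


def mixture_moment(n, j):
    """j-th moment of (7): weight 1 - 1/n on the unit normal,
    weight 1/n on the point x = n (an assumed reading of (7))."""
    return (1 - 1 / n) * normal_moment(j) + n ** (j - 1)


for n in (10, 100, 1000):
    # Second and higher moments grow without limit although F^(n) -> normal.
    print(n, mixture_moment(n, 2), mixture_moment(n, 4))
```

On this reading the jth moment contains the term $n^{j-1}$, which is unbounded for every $j \ge 2$.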


REFERENCES 


Cramér, H. (1946). Mathematical Methods of Statistics. Princeton.

Fréchet, M. & Shohat, J. (1931). A proof of the generalised second limit-theorem in the theory of
probability. Trans. Amer. Math. Soc. 33, 533-43.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London.

Wintner, A. (1928). Über den Konvergenzbegriff der mathematischen Statistik. Math. Z. 28, 476-80.

Zygmund, A. (1947). A remark on characteristic functions. Ann. Math. Statist. 18, 272-6.


[Note added in proof. The following additional references are relevant to the lemmas proved here in §3.
Fortet, R. (1944). Calcul des moments d’une fonction de répartition à partir de sa caractéristique.
Bull. Sci. Math. (2), 68, 117-31.
Landau, E. (1916). Über einen Mellinschen Satz. Arch. Math. Phys., Lpz. (3), 24, 97-107.
Landau gives the property of the exponential function used here in the proof of Lemma 3 (i), and
Fortet proves a theorem which includes Lemma 3 (ii). Fortet’s method is considerably more
sophisticated than ours, but he obtains a stronger result.]




A NOTE ON THE CUMULANTS OF KENDALL’S S-DISTRIBUTION 
By H. SILVERSTONE, University of Otago, Dunedin, New Zealand 


The distribution of the number of inversions of the natural order in the n! permutations of
the first n natural numbers has been studied in connexion with the problem of rank correlation.
It is well known that as n increases, this distribution tends to the normal form.

The calculation of formulae for the fourth and higher order moments by existing methods 
is attended by some complexities. It will be shown that no such difficulties are attached to 
the cumulants, which bear a simple relation to the Bernoulli polynomials. Through the 
cumulants we are also able to study the manner in which the distribution approaches the 
normal form. 

THE CUMULANTS 
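If I = the number of inversions in a permutation of the first n natural numbers, there are
certain advantages in examining the distribution of the measure S suggested by Kendall
(1939, 1948), where

$$S = \tfrac12 n(n-1) - 2I, \qquad -\tfrac12 n(n-1) \le S \le \tfrac12 n(n-1).$$

Kendall (1939) shows that, if all permutations are equally likely, the probability of a
score S is given by the coefficient of $t^S$ in the expansion of

$$f(t) = \tfrac12(t^{-1}+t)\cdot\tfrac13(t^{-2}+1+t^{2})\cdot\tfrac14(t^{-3}+t^{-1}+t+t^{3})\cdots, \quad n-1 \text{ factors}.$$

This is the probability generating function. We write

$$f(t) = \prod_{r=1}^{n}\frac{t^{r}-t^{-r}}{r(t-t^{-1})}.$$

For the characteristic function put $t = e^{i\theta}$, to obtain

$$\prod_{r=1}^{n}\left(\frac{\sin r\theta}{r\sin\theta}\right). \tag{1}$$

The cumulant function is then

$$K(\theta) = \sum_{r=1}^{n}\log\left(\frac{\sin r\theta}{r\sin\theta}\right). \tag{2}$$

Now

$$\log\left(\frac{\sin\theta}{\theta}\right) = -\sum_{k=1}^{\infty}\frac{B_{2k}\,2^{2k-1}\,\theta^{2k}}{k\,(2k)!} \quad (0<\theta<\pi),$$

where $B_{2k}$ is the numerical value of the (2k)th Bernoulli number. Hence

$$\log\left(\frac{\sin r\theta}{r\sin\theta}\right) = -\sum_{k=1}^{\infty}\frac{B_{2k}\,2^{2k-1}(r^{2k}-1)\,\theta^{2k}}{k\,(2k)!} \quad (0<n\theta<\pi),$$

and

$$K(\theta) = -\sum_{r=1}^{n}\sum_{k=1}^{\infty}\frac{B_{2k}\,2^{2k-1}(r^{2k}-1)\,\theta^{2k}}{k\,(2k)!}.$$

The (2k)th cumulant $\kappa_{2k}$, being the coefficient of $(-)^k\theta^{2k}/(2k)!$ in $K(\theta)$, is given by

$$\kappa_{2k} = (-1)^{k-1}\frac{2^{2k-1}B_{2k}}{k}\sum_{r=1}^{n}(r^{2k}-1)
= (-1)^{k-1}\frac{2^{2k-1}B_{2k}}{k}\left[\frac{B_{2k+1}(n)}{2k+1} + n^{2k} - n\right], \tag{3}$$

where $B_{2k+1}(n)$ is the (2k+1)th Bernoulli polynomial in n.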
















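(Note. $B_2 = \tfrac16$, $B_4 = \tfrac1{30}$, $B_6 = \tfrac1{42}$, $B_8 = \tfrac1{30}$, ...,

$$\begin{aligned}
B_3(n) &= n^3 - \tfrac32 n^2 + \tfrac12 n,\\
B_5(n) &= n^5 - \tfrac52 n^4 + \tfrac53 n^3 - \tfrac16 n,\\
B_7(n) &= n^7 - \tfrac72 n^6 + \tfrac72 n^5 - \tfrac76 n^3 + \tfrac16 n,\\
B_9(n) &= n^9 - \tfrac92 n^8 + 6n^7 - \tfrac{21}{5} n^5 + 2n^3 - \tfrac3{10} n.)
\end{aligned}$$

Direct substitution in (3) gives

$$\begin{aligned}
\kappa_2 &= \tfrac1{18}\,n(n-1)(2n+5),\\
\kappa_4 &= -\tfrac1{225}\,n(6n^4 + 15n^3 + 10n^2 - 31),\\
\kappa_6 &= \tfrac{8}{1323}\,n(6n^6 + 21n^5 + 21n^4 - 7n^2 - 41),\\
\kappa_8 &= -\tfrac{8}{675}\,n(10n^8 + 45n^7 + 60n^6 - 42n^4 + 20n^2 - 93).
\end{aligned}$$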
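The expressions for $\kappa_2$ and $\kappa_4$ can be checked against the exact distribution of S for small n. The following sketch (hypothetical function names) builds the distribution of the number of inversions by multiplying the factors $1 + q + \ldots + q^{r-1}$, converts it to the distribution of S, and compares the second and fourth cumulants with the formulae above in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def inversion_counts(n):
    """counts[i] = number of permutations of 1..n with exactly i inversions."""
    counts = [1]
    for r in range(2, n + 1):
        new = [0] * (len(counts) + r - 1)
        for i, c in enumerate(counts):
            for step in range(r):  # multiply by 1 + q + ... + q**(r-1)
                new[i + step] += c
        counts = new
    return counts


def s_distribution(n):
    """Exact distribution of S = n(n-1)/2 - 2I as {score: probability}."""
    top = n * (n - 1) // 2
    total = factorial(n)
    return {top - 2 * i: Fraction(c, total)
            for i, c in enumerate(inversion_counts(n))}


def cumulants_2_4(n):
    """Second and fourth cumulants of S (the odd moments vanish by symmetry)."""
    dist = s_distribution(n)
    m2 = sum(p * s**2 for s, p in dist.items())
    m4 = sum(p * s**4 for s, p in dist.items())
    return m2, m4 - 3 * m2**2


def kappa2(n):
    return Fraction(n * (n - 1) * (2 * n + 5), 18)


def kappa4(n):
    return -Fraction(n * (6 * n**4 + 15 * n**3 + 10 * n**2 - 31), 225)


for n in range(2, 8):
    print(n, cumulants_2_4(n) == (kappa2(n), kappa4(n)))
```

Exact fractions are used so that the comparison is an identity rather than a numerical approximation.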


APPROACH TO NORMALITY 


The cumulant function, in terms of the standard deviation $\sqrt{\kappa_2}$ as unit, is

$$K_1(\theta) = -(a_1\theta^2 + a_2\theta^4 + a_3\theta^6 + \ldots + a_k\theta^{2k} + \ldots),$$

the $a_k$ being positive for all k. While the Bernoulli numbers increase without limit, we find
using the relationship

$$B_{2k} = \frac{2\,(2k)!}{(2\pi)^{2k}}\,\zeta(2k),$$

that

$$a_k = \left[\frac{9^k\,\zeta(2k)}{k\,\pi^{2k}}\right]\left[\frac{\sum_{r=1}^{n}(r^{2k}-1)}{(n^3 + \tfrac32 n^2 - \tfrac52 n)^k}\right]. \tag{4}$$

Since

$$\sum_{r=1}^{n}(r^{2k}-1) < \sum_{r=1}^{n}r^{2k} < n^{2k+1}$$

and $n^3 + \tfrac32 n^2 - \tfrac52 n > n^3$ for $n \ge 2$,

the expression in the second bracket of (4) is less than $1/n^{k-1}$, while for $k \ge 2$ the expression
in the first bracket is certainly less than unity. It follows that for $n \ge 2$ all coefficients in
$K_1(\theta)$, after the first, tend to zero with increasing n; that is,

$$K_1(\theta) \to -\tfrac12\theta^2,$$

and the distribution tends to the normal form as n increases.


MAGNITUDE OF THE ERROR 
We now proceed to find upper and lower bounds for the characteristic function

$$\exp\{K_1(\theta)\} = \exp(-\tfrac12\theta^2)\exp(-a_2\theta^4 - a_3\theta^6 - \ldots).$$

Since $0 < a_k < n^{-(k-1)}$ for all k, we have

$$\exp\{K_1(\theta)\} < \exp\{-\tfrac12\theta^2\}.$$




Again, since a, <n~*-», we have also 


Ga2si\ & 
a, 64 + a,6% + bee <5 Baa) oS an4° 
Hence exp (— 36") < exp {K,(0)} exp (5) 
and exp (- 40? - i) < exp {K,(@)} < exp (— }6*). 


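Now let $p(z)$ be the probability function which is everywhere continuous and which has
$\exp\{K_1(\theta)\}$ as its characteristic function; that is, let

$$p(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-iz\theta}\,e^{K_1(\theta)}\,d\theta.$$

Let $\phi(z)$ be the normal probability function with unit variance; that is,

$$\phi(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-iz\theta}\,e^{-\frac12\theta^2}\,d\theta.$$

Then

$$\phi(z) - p(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-iz\theta}\,e^{-\frac12\theta^2}\{1 - e^{-\psi(\theta)}\}\,d\theta,$$

where

$$-\psi(\theta) = K_1(\theta) + \tfrac12\theta^2, \qquad 0 < \psi(\theta) < \frac{\theta^4}{n-1}.$$

Since, by the mean value theorem,

$$1 - e^{-\psi(\theta)} < \psi(\theta) \quad\text{if } \psi(\theta) > 0,$$

$$|\phi(z) - p(z)| < \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-\frac12\theta^2}\frac{\theta^4}{n-1}\,d\theta = \frac{A}{n-1},$$

where A is an absolute constant. ($A = 3/\sqrt{2\pi}$, but is not the ‘best possible’ constant.)

Hence, in standardized units, the error in replacing $p(z)$ by $\phi(z)$ is O(1/n), a result consistent
with that obtained by Haden (1947).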


ERROR IN TESTS OF SIGNIFICANCE 
For the probability function p(z), the probability that z> z, is 
Plea) = | “playa, 


the corresponding probability for the normal function being 


[74 dz. 
Hence P(2) - |" pceae =|" — $(z)} dz. 
Now I 4, p(z)—G(2)}dz = | : (5 | te e-10" fe-W)— 1] ao) dz 


0 —78) 
. (e- 10% — e-it2) g10® [F-| do, 




by the uniform convergence of the inner integral over any finite range of z. Hence, for all 











z and z 2 
, sd 2 (24% 63 dé B 
[. @@-9ey dz|< =| ——T “aly 
That is, Pl) - {"4e dz|< ——. 
fe n—1 
CALCULATION OF P(Z) 
, 7 Ki6 
Since exp {K,(4)} = exp (—}6*)exp (5s ai 3 eit ‘ 
Ot K, 8 
— oi pki pl 
ple) = pte) +"4 a 70% +733 aie @)+- (5) 
where ¢(z) = (7) $e. 
It follows that P(z) = I * plz) dz 
-[" 4) )de— eg" (2) - 50% Z)—.- (6) 


where the second term on the right is O(1/n) and the third term is O(1/n²).

Taking in only the first two terms on the right-hand side of (6), we may calculate the
probability, P, of equalling or exceeding a certain score, S ($z = S/\sqrt{\kappa_2}$), and compare it with
(a) the estimate, $P_N$, based on the assumption that the distribution is normal, (b) the exact
probability, $P_E$, calculated from the generating function itself. Some comparisons in the
case where n = 10 are afforded by the following table. Corrections have been made for
continuity.























 S       P         P_N       P_E

 7     0.300     0.296     0.300
11     0.190     0.186     0.190
15     0.108     0.105     0.108
17     0.078     0.076     0.078
19     0.0543    0.0537    0.0542
25     0.0147    0.0159    0.0143
29     0.0049    0.0061    0.0046





It will be seen that the assumption of normality is somewhat inadequate for precisely those
values of S whose significance may be in doubt. The third derivative $\phi'''(z)$, whose value enters
into the correction term, has numerical maxima (algebraic minima) at z = ±2.33, that is,
at the '2% points' of the normal distribution.
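The position quoted for these extrema can be checked in a line or two. The sketch below uses only the standard identity $\phi^{(r)}(z) = (-1)^r He_r(z)\,\phi(z)$ for the derivatives of the normal function (the symbol $He_r$ for the Hermite polynomial is notation introduced here, not the author's):

```latex
\phi'''(z) = -(z^{3} - 3z)\,\phi(z), \qquad
\frac{d}{dz}\,\phi'''(z) = \phi^{(4)}(z) = (z^{4} - 6z^{2} + 3)\,\phi(z) = 0
\;\Longrightarrow\; z^{2} = 3 \pm \sqrt{6}.
```

The outer pair of stationary points is therefore $z = \pm\sqrt{3+\sqrt{6}} \approx \pm 2.33$, and the normal probability lying beyond these two points together is about 0.02.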

For n = 20 the probability that | S| > 58 is 0-0646 on the normal hypothesis and 0-0642 
with the correction; but the corresponding probabilities for | S | > 80 are 0-0104 and 0-0092, 
respectively. The discrepancy is, however, less marked than for, say, n = 10, where the 
corresponding probabilities for | S| >29 are 0-0122 and 0-0092, respectively. (The exact 







result is 0-0098.) The conclusion to be drawn would appear to be that for moderate values 
of n (say for n between 10 and 25) the correction should be applied when S is numerically 
large to obtain a closer estimate if required.
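The n = 10 comparison can be re-computed directly. The following is only a sketch and is not the method of calculation used in the paper: it assumes that S equals n(n-1)/2 minus twice the number of inversions of a random permutation, that the variance of S is n(n-1)(2n+5)/18, and that the continuity correction amounts to reducing S by one unit. The function names are illustrative.

```python
from math import comb, erfc, sqrt


def inversion_counts(n):
    """counts[i] = number of permutations of n objects with exactly i inversions."""
    counts = [1]
    for m in range(2, n + 1):
        # Inserting the m-th object adds between 0 and m-1 inversions.
        new = [0] * (len(counts) + m - 1)
        for i, c in enumerate(counts):
            for j in range(m):
                new[i + j] += c
        counts = new
    return counts


def exact_upper_tail(n, s):
    """Exact P(S >= s), assuming S = n(n-1)/2 - 2 * (number of inversions)."""
    counts = inversion_counts(n)
    limit = (comb(n, 2) - s) // 2  # S >= s  <=>  inversions <= limit
    if limit < 0:
        return 0.0
    return sum(counts[: limit + 1]) / sum(counts)


def normal_upper_tail(n, s):
    """Normal estimate of P(S >= s) with a continuity correction of one unit."""
    variance = n * (n - 1) * (2 * n + 5) / 18
    z = (s - 1) / sqrt(variance)
    return 0.5 * erfc(z / sqrt(2))


if __name__ == "__main__":
    for s in (7, 11, 15, 17, 19, 25, 29):
        print(s, round(exact_upper_tail(10, s), 4), round(normal_upper_tail(10, s), 4))
```

The loop prints the exact and normal one-sided tails for the scores tabulated for n = 10; the corrected value P would additionally need the cumulant terms of (6), which are not reproduced here.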


I would like to express my appreciation to Professor R. M. Gabriel of the University of 
Otago for his kindness in discussing with me certain of the points of analysis involved. 


Editorial note. This paper was received about the same time as the expression (3) was given 
also by P. A. Moran at a symposium on rank correlation methods held by the Research 
Section of the Royal Statistical Society (J. R. Statist. Soc. Ser. B, 12, Part 2, at Press). 
Dr Silverstone has, however, developed the consequences of the formula more fully than 
Dr Moran, whose paper was mainly expository. 

REFERENCES 


HADEN, H. G. (1947). A note on the distribution of the different orderings of n objects. Proc. Camb. Phil. Soc. 43, 1.
KENDALL, M. G. (1939). The Advanced Theory of Statistics, 1. Charles Griffin and Co. 
KENDALL, M. G. (1948). Rank Correlation Methods. Charles Griffin and Co. 













THE DISTRIBUTION OF THE VARIANCE RATIO IN RANDOM 
SAMPLES OF ANY SIZE DRAWN FROM NON-NORMAL UNIVERSES 


By A. K. GAYEN 
St Catharine’s College, University of Cambridge 


1. INTRODUCTION 


With some theoretical results and extensive sampling experiments from populations of 
known form, E. S. Pearson (1931) has studied the effect of non-normality on the frequency 
distribution of the variance ratio. In the case of a one-way classification for analysis of 
variance, he has shown that ‘Between-groups’ and ‘Within-groups’ mean squares still 
continue to provide unbiased estimates of the population variance, but they are no longer 
independently distributed. However, in view of the fact that the expressions for the first 
two moments of their ratio (denoted here by w) are, up to certain approximations, independent
of the population β's, he has inferred that the normal-theory test will not be seriously
invalidated, provided the total number of samples is not too small. But in the more general 
problem, where two essentially different estimates of variance are compared, he has pointed 
out that the distribution of their ratio (denoted here by v) will be considerably more sensitive 
to changes in the population form. The sampling investigation of T. Eden & F. Yates (1933) 
was not of the same kind as that of E. S. Pearson; it was carried out with the object, 
as M. G. Kendall (1946) has stated, of confirming the z-test ($z = \tfrac{1}{2}\log_e w$) for data under
randomization. The experimental material considered by the authors exhibits a decided
skewness, as measured by $\lambda_3$ ($= \sqrt{\beta_1}$); but the other measure of deviation, namely, the
kurtosis $\lambda_4$ ($= \beta_2 - 3$), the effect of which on w is rather more serious, has not been referred
to in the course of their work. R. C. Geary (1947) has derived an approximate formula for
the probability correction of w, of which a suggestion has been made for tabulation. He has
also furnished asymptotic expressions for the first two moments of z ($= \tfrac{1}{2}\log_e v$) in samples
from any population and discussed some methods for their use in the evaluation of the 
approximate true probability. His formulae, in both cases, are based on the large sample 
assumption and consider the effects of kurtosis only. 

In the present paper the problem is studied theoretically in some detail by deriving the 
mathematical forms of the distributions of both the test functions w and v for populations 
characterized by the a priori values of the universal λ's and expressed by the first four terms
(up to $\lambda_3^2$) of the Edgeworth series. In addition to the normal-theory function, the frequency
density in each case furnishes corrective terms in $\lambda_4$ and $\lambda_3^2$. The first two moments calculated
directly from the derived functions agree up to certain approximations with the results
(obtained otherwise by various workers) which are known to be true of any universe. Thus, 
starting from the Edgeworth series, it seems possible to reach results which in fairly large 
samples may closely approximate the actual distribution of the variance ratio for any form 
of population. 
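The four-term Edgeworth form referred to above (written out as (2-1) in Section 2) can be explored numerically. The sketch below assumes the usual form of the series, φ − (λ3/3!)φ''' + (λ4/4!)φ'''' + (10λ3²/6!)φ⁽⁶⁾, rewritten with Hermite polynomials; the function names and the crude midpoint integration are illustrative choices, not part of the paper.

```python
from math import exp, pi, sqrt


def edgeworth_density(x, lam3, lam4):
    """Four-term Edgeworth density for a standardized variate.

    Uses phi^(r)(x) = (-1)^r He_r(x) phi(x), so that the series becomes
    phi(x) multiplied by a polynomial in x.
    """
    phi = exp(-0.5 * x * x) / sqrt(2.0 * pi)
    he3 = x**3 - 3 * x
    he4 = x**4 - 6 * x**2 + 3
    he6 = x**6 - 15 * x**4 + 45 * x**2 - 15
    return phi * (1 + lam3 / 6 * he3 + lam4 / 24 * he4 + lam3**2 / 72 * he6)


def moment(order, lam3, lam4, half_width=12.0, steps=48000):
    """Midpoint-rule estimate of E[x**order] under the Edgeworth density."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += x**order * edgeworth_density(x, lam3, lam4)
    return total * h
```

Under these assumptions the density integrates to unity, has zero mean and unit variance, and its third and fourth moments are λ3 and 3 + λ4, which is what makes λ3 and λ4 the measures of skewness and excess of the population.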

Formulae for the tail area, derived and tabulated in the text, enable us to examine the 
true probability of the variance ratio in any size of samples for a priori values of $\lambda_3$ and $\lambda_4$,
provided the populations agree well with the first four terms of the Edgeworth series. The
same expressions remain valid asymptotically for any universe with finite cumulants, so







that it is not unlikely that for moderate sizes of samples they have quite an extended range 
of applicability. Where the exact knowledge of the λ's is not available, we may sometimes
safeguard against error by considering the corrections for their plausible values. A suggestion
for the use of R. A. Fisher's k statistics in the estimation of λ's from the sample values has
been made by R. C. Geary in his 1947 paper.


2. DISTRIBUTION OF THE VARIANCE RATIO IN k SAMPLES 
(2-1) Frequency density of 'Between groups' and 'Within groups' sum of squares


Consider sets of k samples of size $n_1, n_2, \dots, n_j, \dots, n_k$ drawn from a population expressed by
the Edgeworth series

$f(x) = \phi(x) - \dfrac{\lambda_3}{3!}\,\phi^{(3)}(x) + \dfrac{\lambda_4}{4!}\,\phi^{(4)}(x) + \dfrac{10\lambda_3^2}{6!}\,\phi^{(6)}(x)$, (2-1)

where $\phi(x)$ is the standardized normal function, $\phi^{(r)}(x)$ its rth derivative, and $\lambda_3$ ($= \sqrt{\beta_1}$),
$\lambda_4$ ($= \beta_2 - 3$) are respectively the measures of skewness and excess of the universe. Let us
write for the 'Between-groups' and 'Within-groups' sum of squares


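$X = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{x})^2$ and $Y = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$, (2-2)

where, as usual, $x_{ij}$ is the ith variable of the jth group, $\bar{x}_j$ its mean and $\bar{x}$ the mean of all
$N = \sum_{j=1}^{k} n_j$ variables. If we put

$S_{1j} = \sum_{i=1}^{n_j} x_{ij}$, $S_{2j} = \sum_{i=1}^{n_j} x_{ij}^2$, $S_{3j} = S_{2j} - \dfrac{S_{1j}^2}{n_j} = \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$, (2-3)

and $X_1 = \sum_{j=1}^{k} S_{1j}$, $X_2 = \sum_{j=1}^{k} \dfrac{S_{1j}^2}{n_j}$, (2-4)

then it will be seen that $X = X_2 - \dfrac{X_1^2}{N}$ and $Y = \sum_{j=1}^{k} S_{3j}$. (2-5)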


By the use of the author’s (Gayen, 1949) result (2-11) (which gives the distribution 
9(S13, Se3) of S,; and S,;), the joint characteristic function of X,, X, and Y can be obtained as 


k co ro) 
__ exp{—N#/2(1— 2it,)} 
~ (1— 2it,)# (1 — 2iuyt—-®) 


Ay 6Nk 3k’? 
+aN N ats {(3 Hy (T42) + (1 pe (1 — 2it,) Ai, (72) 2 (1 — 2it,)? :) 





N-k 3k 
1443 at i ~ wi (712) + NA, (12) + oi) — Se >) H,(7 ne} 








j= ip (Nk—k’?)\ | 3(N?—2Nk+k?) 
+7) (y (N—k) (rs) + Gay me 
_ {(u H,(72) + 3N*(N — 1) {3.H,(712) + 6Hy(7 12) + 2}] 
+ aaa [N2(2k + 3) Ay(742) + 6N{(Nk + 2N — 3k) H,(7,2) + (N —&)}] 
9 : ‘15k” 
+a 3m, Tapa yelV Mh + 4) Halse) + Nh( e+ 2)— 3k] +7 ia) 













*G- == (are k) H,(142) + 3N(N — 3) (N —k) H,(7,2)— 3N(N — k)] 
: 4 , (Nk—k’2) 
+ (1 — 2i#,) [N(N —k) (k+ 1) Hy (742)+{N(N —k) k-—3NK + 3k'}]+3 a a) 





9 
+{N%N —1)—2Nk(N —2) + NA2— 3k] + oe ; = -*) 
6 , 
+ 72a (N?—3Nk+ 2k »}. (2-6) 
where " Typ = tt,/(1— 2%t,), (2-7) 
$k'^2 = N \sum_{j=1}^{k} (1/n_j)$ ($= k^2$, if the $n_j$'s are equal), (2-8)

and $H_\nu(x)$ is the well-known Hermitian polynomial of νth degree in x.
For the evaluation of the integrals occurring in (2-6) which reduce to the form 


exp { —n,t7/2(1 — 2it - t it 
Jen) eel: ne ‘i = a, i) erm (sen — 2) * A 4] dil ate 
where t = s/[(1— 2%it,)/n,] {S,; —it,n,/(1 — 2it,)} (210) 
(v taking the values 0, 1, 2,3, 4and 6;7,, —2, —1, 0and 1; and rg, 0, 1, 2 and 3), the following 
results were found useful: 


© AES) m6)+()amale)+()@) aE). em 
(5) (5) [a)-"a 1) 2(5) 

pene. aa (5, 1) Be (5)--..]. (2-12) 

i J7_%(5)*(e) 46) - Sn (a) (Tar) (219) 

(iv) | * Hwn(=) o() a(=) im 0, (2:14) 


To derive the distribution of $X_1$, $X_2$ and Y by Fourier's inversion theorem we have to
encounter some of the similar types of integrals (2-9), (2-13) and (2-14). On performing the
integrations and writing

$\gamma(\nu; \xi) = \dfrac{\xi^{\nu/2 - 1}\, e^{-\xi/2}}{2^{\nu/2}\,\Gamma(\tfrac{1}{2}\nu)}$, (2-15)

we obtain the joint frequency density in the form
f% Xv ¥) = (TR) | rk-1,HyeV-E, Y) 




















+33((way (5 )rte- 1, X)+3(k—1)H, (4) 1+ 1,x)) 10-2, Y) 
+3(N— b) Ha) eI, X)y(N—k+2, Y)| 
+ ay | (97%(59) y(k—1,X)+6N(k- 1) A,(53) y(k-+1, X) 


+ 3(k’2—2k+ k+3,X)) Nk, Y) 







+0(1vUv - ky BA 3) 7 (e—1,X)+(Nk—k®—N +h) y(k +1, x))y (N—k+2,Y) 
+3(N?—2Nk +k’) y(k—1,X)y(N—k+4, ¥)} 


+h (Drag) see —ofon( 9) an lama 


1 
W 
+6f 4 1) (3) +3 3) (k—-1) (5) - sN(k—1) |y(k+1,X) 
+9 (k * 1) H,( 5!) +(e 1)—3(k'2— 2k+1) |y(k-+3, x) 


+ 3[5k'2— 3k? 6+ 4] (k-+5, x) y(N—k, Y) 
+ a([ wa k) #(3 ) +3N(N—3)(N— my H,(5}) _3N(N— | y(ke—1,X) 
+ af rae 1) — N (2+ 2k—3) +3(k—b)} + N(N—b)(k- 1) (2) | y(k-+1,X) 
+3(2—k) y(k+3, x) y(N—k+2, ¥) 
+ o([ avs ~ 2N%(k—1) + Nk(k—2)} #453) + {N3— N(2k-+ 1) + Nk(k+4)— 3k 


x y(e— 1, X) +{2N (k— 1) — 2k’? — k(k—2)} (e+ 1, X)) yV—k+ 4, Y) 
+ 6(evv ~ 3k) + 2k} y(k—1, x) y(N'-k+6, y)}| (2-16) 


where ¢(X,/./N) is the normal function in (X,/,/N), and H,(X,/N) is the usual Hermitian 
polynomial in (X,/N) of vth degree. 

As a check, the mean of f(X,, X,, Y) over X, and Y was found to agree with the standard 
formula for the distribution of the sum of N samples from the Edgeworth population. Now 
integrating for X,, the required joint frequency density of X and Y is obtained as 


f(-,X, Y) = y(k-1,X) y(N —-k, Y) 
+ At (((e—2b-+ 1) y(k+3, X)—2(N —1) (k—-1) y(k+1, X) 
+(N—1)y(k—-1, X)] y(N —&, ¥) 
+ 2[{N(k—1)—k’2+ kb} y(k+1, X)—(N—1)(N—k) y(k—-1, X)]y(N-k+2, ¥) 
+[{N(N — 2k) +k} y(k—1, X)]y(N —k+4, Y)} 


+8 {[(5k’? — 3k(k + 2) + 4) y(k+ 5, X) — 3(3k"* —k(k + 6) +4) y(k+3, X) 


+ 34N 
+6(N —2) (k—1) y(k+1, X)—2(N —1)(N—2)y(k—1, X)]7(N—&, Y) 
+ 6[(k2—k’2) p(k +3, X) —(2N(k—1)— 3k’? + k(k + 2)) y(k-+ 1, X) 
 4(N—k)(N—2)y(k—-1, X)] y(N —k+2, Y) 
+3[(2NV(k— 1) —k’?—k(k—2)) y(k +1, X) —(2N(N—1)4 3k? 
—k(4N —k+2))y(k—1, X)] y(N —k+4, Y) 
+2[(N?—3Nk + 2k) y(k—1, X)]y(N—k+ 6, ¥)}, (2-17) 
















the term in $\lambda_3$ not appearing as it involved odd functions in $X_1$. For the frequency densities
of X and Y we have


f°, X,°) = y(k-1, X)+ At wt ak+ 1) {y(k + 3, X)—2y(k+1, X)+y(k—1, X)} 


+8 (oe 3k? — 6k + 4) {y(k+5, X)—3y(k+3, X) + 3y(k+ 1, X)—y(k—1, X)} 


(2-18) 
and 


Rene ¥) + At (wt 2Nk+k){y(N—+4,¥)—2y(N—k+2, ¥)+y(V—k,Y)} 


+a AS_(2_3Nvk-+2k'2) {y(N —k+6, ¥Y)—3y(N—k+4, Y) 


+3y(N —k+2, Y)—y(N—k, Y)}, (2-19) 
which are, as observed by E. S. Pearson (1931, p. 119, footnote), dependent on the $n_j$'s. The above
expressions for the frequencies of X and Y are analogous in form to the Romanovsky
Generalized Type III Pearson curve. The formulae for the first two moments and covariance
of X and Y, viz.

Mean X = k − 1, Mean Y = N − k,

$\mathrm{Var}\,X = 2(k-1) + \dfrac{\lambda_4}{N}(k'^2 - 2k + 1)$, $\mathrm{Var}\,Y = 2(N-k) + \dfrac{\lambda_4}{N}(N^2 - 2Nk + k'^2)$, (2-20)

$\mathrm{Cov}(X, Y) = \dfrac{\lambda_4}{N}(Nk - N + k - k'^2)$,

calculated directly from the above distributions, are found to agree with Pearson's results
(7)-(11), which he deduced by a different method.
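A simple consistency check on the moment formulae (2-20), as they are read here, is that Var X + Var Y + 2 Cov(X, Y) should reproduce the variance of the total sum of squares X + Y about the grand mean, which for a unit-variance population is 2(N − 1) + λ4(N − 1)²/N. The sketch below evaluates both sides; the function names and the choice of group sizes are illustrative only.

```python
def sums_of_squares_moments(group_sizes, lam4):
    """Means, variances and covariance of X and Y, following (2-20)."""
    k = len(group_sizes)
    N = sum(group_sizes)
    k_dash_sq = N * sum(1.0 / n for n in group_sizes)  # k'^2 of (2-8)
    c = lam4 / N
    return {
        "mean_X": k - 1,
        "mean_Y": N - k,
        "var_X": 2 * (k - 1) + c * (k_dash_sq - 2 * k + 1),
        "var_Y": 2 * (N - k) + c * (N * N - 2 * N * k + k_dash_sq),
        "cov_XY": c * (N * k - N + k - k_dash_sq),
    }


def total_ss_variance(N, lam4):
    """Variance of the total sum of squares about the grand mean (unit variance)."""
    return 2 * (N - 1) + lam4 * (N - 1) ** 2 / N
```

For equal groups k'² = k², so the covariance reduces to λ4(N − k)(k − 1)/N: the two sums of squares are positively correlated in leptokurtic populations and uncorrelated when λ4 = 0.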


(2-2) Frequency density of the ratio of two estimates of variance 
and the probability corrections for excess and skewness 


Let $w = \dfrac{\nu_2 X}{\nu_1 Y}$

be the ratio of two estimates of variance with $\nu_1 = k - 1$ and $\nu_2 = N - k$ degrees of freedom.

Then the typical product function $\gamma(l, X)\,\gamma(m, Y)$ occurring in formula (2-17) yields for
the frequency distribution of w the expression

$\beta(\tfrac{1}{2}l, \tfrac{1}{2}m; w) = \dfrac{\nu_1^{l/2}\,\nu_2^{m/2}\, w^{l/2 - 1}}{B(\tfrac{1}{2}l, \tfrac{1}{2}m)\,(\nu_2 + \nu_1 w)^{(l+m)/2}}$, (2-21)
which, as may be noted, gives the normal-theory frequency of w for $l = \nu_1$ and $m = \nu_2$.
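The typical term of (2-21) can be evaluated as follows. This is a sketch under the assumption that the γ functions are chi-square densities with l and m degrees of freedom, so that the term is the density of ν2 X/(ν1 Y) for independent chi-square variates; the helper names are illustrative.

```python
from math import exp, lgamma, log


def beta_density(l, m, w, nu1, nu2):
    """Density of w = nu2*X / (nu1*Y) for independent chi-square X (l d.f.)
    and Y (m d.f.): the typical term written beta(l/2, m/2; w) in the text."""
    if w <= 0:
        return 0.0
    log_beta_fn = lgamma(l / 2) + lgamma(m / 2) - lgamma((l + m) / 2)
    log_density = (
        (l / 2) * log(nu1)
        + (m / 2) * log(nu2)
        + (l / 2 - 1) * log(w)
        - log_beta_fn
        - ((l + m) / 2) * log(nu2 + nu1 * w)
    )
    return exp(log_density)


def integrate(f, a, b, steps=100000):
    """Midpoint rule; adequate for the smooth, light-tailed case checked below."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))
```

Each term integrates to unity over w > 0 whatever the values of l and m, so any set of coefficients summing to zero leaves the total frequency of the corrected density unchanged.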
Using (2-21) in (2-17) we obtain the required frequency density of w in the form 


p(w) = B(dr4, 4%; w) — = rnd (A(2F*, 2 7, w) - 2p(5*, “a, wo) + A(z! “+4: wl 





(2-21) 























2 2” 2 
+ 8{ 00) (AE, 2; w) — 30000) (AES, “22, w) 
+3) A(2E=, BE wo) — ada, AE w)] 
= Polvo) —AePr,(w) + Aba), (2-22) 


{2v, V+ (k?—k’?) (v, + v, + 2)} 
8(V, +Vgt+1) (V+ r—_+2) ’ 





where (Veg) = (2-23) 






















and 
(vq,) = _ {2», Vly (Vy + Ve + 7) — (Vg— 22)] + (hk? — k’?) (vy + Vg + 2) (40, — 54+ 16)} 
24(v1 + V+ 1) (vy +¥_+ 2) (Vy +24 4) 
(Ves) = {21 ve[1(Yy + Vg + 7) — (V_— 10)] + (k? — k’?) (vy + Vg + 2) (40, — SVQ + i 
™ 24(vy + Vo+1) (vy ++ 2) (Vy +¥_+4) 2-94 
(Vgg) = {2v, ve[4(Yy + Vo + 7) — (Vg + 2)] + (k?— k’?) (vy + Vg 4 2) oe * , 
™ 24(V, + Vq+ 1) (Vy + ¥_+ 2) (¥y + 24+ 4) 
(94) = {2v, volvy(Vy + Vg +7) — (Vet 14)] + (k?— ke’) (vy + V9 + 2) (40, — 5¥, — 20)} 
"i 2414, + ¥g+ 1) (vy +¥_+ 2) (Vy +24 4) 
Note that (Vg1) — 3(Vgq) + 3(¥33) — (Vag) = 9, (2-25) 


as expected, since I, p(w) dw -| p(w) dw = 1. The (v,,) coefficients may be put into the 
0 


form 


$(\nu_{rs}) = (\nu_{rs})' + (\Delta k^2)\,(\nu_{rs})''$, (2-26)
where $\Delta k^2 = k^2 - k'^2 = k^2 - N \sum_{j=1}^{k} (1/n_j)$, (2-27)


an expression which is zero when the $n_j$'s are equal. By direct calculation from the frequency
function p(w) of (2-22), we obtain


Vey 2v, Vo + (AK?) (v, + V2 + 2) 2 2{2v, vo(v, + 5) + (Ak?) (4v, + v2 + 4)} 




















Me = _ 

eee yg — 2 “4 YA(Vq—2) (Vg 2) (Vy Vat 1)? M4(¥g— 2) (Pg + 2) (Vg +4) (Vy + M2 + 1) 
1 e+ 2) , 2(ke+ 4k +4) 1 (2(k—1)+(Ak?) | k(k—1) + (Ak?) (2k +1) 
Rats xi ie -a~y aiid Ns 

2 2(k—1)(k+4)+ (Ak 
+E) Ns ', (228) 
and 
Veris a 2v3(Y2+¥; — 2) a 2a" Vy + (AK?) (vy + Vp + 2)} {Py (Yq + 2) + (¥2— 2) (¥2 + 8)} 


V(P_— 2)? (Vp — 4) 
8yp{v; Ve[r3(Y; + 2) + Ya(P, + 3) (v1 + 8) — 2(2r, + 7)] 
ar — (Ak?) [v3(v, —v, + 1) — vg(23 + 11y, + 10) + 4(5y, + 2)}} 
, Vi(¥_— 2)? (vg + 2) (Ya — 4) (Vg + 4) (Vg +4 + 1) 


Vi(Vg— 2)? (Vg + 2) (Vg— 4) (¥y + P+ 1) 











ih (k+5)  (k2+13k +20) 
-—-5l! ss  - 
Gh pena 2(k? + 10k — 6) + 2(AK*) (& + 6)) 
(k—1) N WN? | 
ag 8 (E+ R=1)- (AM) (9,99) 





3(k—1)? N? 


R. C. Geary's (1947) asymptotic formulae (up to $N^{-1}$) for moments of w for any universe
(his formula (2-11)), deduced otherwise from E. S. Pearson's (1931) equations (9)-(11), are
found to be in agreement with the first few terms of our results (2-28) and (2-29). Accordingly,
it appears that although our expressions are valid strictly for the Edgeworth form of
population, they may hold good approximately for any universe when the samples are
sufficiently large. The formula for Var w, while confirming Geary's result (up to $N^{-1}$),



appears to contradict E. S. Pearson's (1931) conjecture made on the results of his experimental
sampling that 'for the leptokurtic populations, $\sigma_{\eta^2}$ (the S.D. of correlation-ratio) is
probably significantly too large, and for the skew populations significantly too small'.
Integrating (2-22) from $w_0$ (a typical value of w) to ∞, we obtain


























Pw) = [” p(w) due = P00) + Ag, Wo) + ABP), (2-30) 
where Fo (wo) = 1,,(4¥2, 4%); (2-31) 
the normal-theory tail probability, 
Pala) = (an) {lg(3B, 25) — 27,,(85, 25) 4 7,,(5*, 2)! (2-32) 
“ 5 |e Vy 1 oa (Ve+v,+4) saad at ' 2 
= 20m) 2b — 20 (oy Gr bay eee al [B(5 ), (2s2bio 
Pyx(wo) = 
6 
(0m) La("2, 2F) — a0 fa(2E*, 2E*) + 94050) Ze(“AE*, AE") — 5) 1,("2F*, 3] 
(2:33) 








Ze (1 — 2.01 (¥31) (¥y— 23) (Yai) _ 9 (Yse) 

iia et Perea terices, os 
V4(Vy + 2) (Vy + 4) + Vqi¥_ + 2) (+4), y— 3 MaMa) 4 + Mn+ 2) Ag+ 2) |, ) 
(vy + 4) (Vg + 2) (Vg + 4) (Vy +¥2+4). ™ (Vg + 2) (Vet 4) (Vy + ¥_+ 4) ” 


the expression J,,(s,r) being the incomplete B-function ratio, as defined by Karl Pearson, 
with 2) = 1 Wy/(V2+V,W). The alternative expressions (2-32 bis) and (2-33 bis) were reached 
by the application of H. E. Soper’s (1921, pp. 19-20) formulae for effecting the index changes 
of I,(s,r) function. 

The integrals $P_{\lambda_4}(w_0)$ and $P_{\lambda_3^2}(w_0)$ would give the probability corrections at $w_0$ for the
population kurtosis and skewness respectively. Using the form of expressions (2-26) for the
$(\nu_{rs})$ coefficients, if we write

$P_\lambda(w_0) = P'_\lambda(w_0) + (\Delta k^2)\,P''_\lambda(w_0)$, (2-34)

then formula (2-30) for the approximate true probability at $w_0$ can be finally put into the form

$P(w_0) = P_0(w_0) + \lambda_4\{P'_{\lambda_4}(w_0) + (\Delta k^2)\,P''_{\lambda_4}(w_0)\} + \lambda_3^2\{P'_{\lambda_3^2}(w_0) + (\Delta k^2)\,P''_{\lambda_3^2}(w_0)\}$. (2-35)

Note that the $P''_\lambda(w_0)$'s might give the measures of corrections needed for $\Delta k^2$, defined in
(2-27), which depends, as we know, on the differences in sample size.

For values of $\nu_1$ = 1(1)6, 8, 12, 24 and ∞, $\nu_2$ = 1(1)6, 8, 12, 20, 24, 30, 40, 60, 120 and ∞,
the functions $P_\lambda(w_0)$'s of (2-35) have been tabulated in all cases at values of $w_0$, given by
$P_0(w_0) = \int_{w_0}^{\infty} p_0(w)\,dw = 0.05$, i.e. at the 5% points of the normal-theory w (see Table 4,
pp. 252-5 below). Catherine M. Thompson's (1941) five-figure values of the transformed
variate $x_0$ have been used. In most cases, as a convenient check to the tabulation work,
both the forms of (2-32) and (2-33) have been utilized. The alternative forms, (2-32 bis) and
(2-33 bis), as they avoid the $I_x(s, r)$ functions, are more suitable for computational work in
cases of high or low values of $\nu_1$ and $\nu_2$.
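The normal-theory tail probability at which the corrections are tabulated can be checked numerically. The sketch below is not the computing scheme of the paper: it assumes the familiar identity that the upper tail of the variance ratio equals the incomplete beta-function ratio with arguments ν2/2 and ν1/2 taken at x = ν2/(ν2 + ν1 w0), and evaluates it by Simpson's rule. The function name and the restriction to ν2 ≥ 2 and w0 > 0 are choices made here.

```python
from math import exp, lgamma, log


def f_upper_tail(w0, nu1, nu2, steps=20000):
    """P(w >= w0) for the normal-theory variance ratio with (nu1, nu2) d.f.

    Integrates the beta density with parameters (nu2/2, nu1/2) from 0 to
    x = nu2 / (nu2 + nu1*w0) by Simpson's rule (assumes nu2 >= 2, w0 > 0).
    """
    a, b = nu2 / 2.0, nu1 / 2.0
    x = nu2 / (nu2 + nu1 * w0)
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)

    def integrand(t):
        if t <= 0.0:
            # Limit at the origin: zero when a > 1, the constant when a == 1.
            return 0.0 if a > 1 else exp(log_norm)
        return exp(log_norm + (a - 1) * log(t) + (b - 1) * log(1.0 - t))

    h = x / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

For example, `f_upper_tail(2.8661, 4, 20)` should come out very close to 0.05, the level at which the corrections for that pair of degrees of freedom are tabulated.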

The probability corrections, which are found to be rather small, diminish rapidly in 
magnitude with the increasing sample size (even with as low a value as 12 for v, in most cases). 




The tables for the $P''_\lambda(w_0)$'s indicate that the effect of small differences in the sizes of groups
as measured by $\Delta k^2$ will not be very considerable. Although a comparatively smaller effect
of skewness, $\lambda_3^2$, than that of kurtosis, $\lambda_4$, on the variance ratio w is indicated by formulae
































Fig. 1. Showing the comparative values of $P'_{\lambda_4}(w_0)$ and $P'_{\lambda_3^2}(w_0)$, at the 5% points of the
normal-theory w, for certain values of $\nu_1$ and $\nu_2$.


























Fig. 2. Showing the comparative values of $P''_{\lambda_4}(w_0)$ and $P''_{\lambda_3^2}(w_0)$, at the 5% points of the
normal-theory w, for certain values of $\nu_1$ and $\nu_2$.







(2-28) and (2-29), at the upper 5% points the corrections P}:(w)) are found to be some- 
what larger in magnitude than P},(w)). The relative effects of skewness and excess on 
the tail probabilities of w for a few representative cases will be seen in the diagrams 
(Figs. 1 and 2) on p. 243. 




(2-3) Asymptotic character of the frequency density of w

The expression (2-22) approximates fairly closely to the actual distribution of w in samples
drawn from any universe with finite cumulants, provided the samples are so large that terms
in $N^{-2}$ can be neglected. On the other hand, it gives the frequency density of w for samples
of any size drawn from the Edgeworth series (2-1) when the terms in λ's other than $\lambda_3$, $\lambda_4$
and $\lambda_3^2$ are negligible. R. C. Geary's (1947) asymptotic formulae for the first two moments
of w for any population (his result (2-11)) indicate that higher order cumulants can only
occur in terms in $N^{-2}$ (or in higher negative powers of N). Thus it is not unlikely that for
samples of moderate size, i.e. when the total number of observations N is not too small,
the frequency function p(w) has quite an extended range of applicability, provided higher
order cumulants other than those considered are negligibly small. Note that the odd powers
of odd-order cumulants have no effect on the distribution of the variance ratio.

The true probability $P(w_0)$, (2-30), depends on the population values of $\lambda_3^2$ and $\lambda_4$. When
the exact knowledge of these cumulants is not available, we may sometimes safeguard against
error by considering the corrections for their plausible values. A suggestion for the use of
the corresponding k statistics of R. A. Fisher in the estimation of λ's from the sample values
has been made by R. C. Geary in his 1947 paper.

Table 1 shows for the case of five samples of five observations and for a fairly wide range
of $\lambda_3^2$ and $\lambda_4$, the values of $P(w_0)$ at $w_0 = 2.8661$, the normal-theory 5% point. It is obviously
unwise to assume that the tabulated results necessarily furnish in all cases of large values
of $\lambda_3^2$ and $\lambda_4$ satisfactory estimates of the true probabilities of w for such small samples.

These results show that for the Edgeworth form of leptokurtic symmetrical populations,
the $P(w_0)$'s are always less than their normal-theory values, $P_0(w_0)$'s. That this is also true
for large samples from any universe is indicated by R. C. Geary's (1947) formula for the
probability correction (2-16). But the fact appears to contradict E. S. Pearson's (1931)


Table 1. Showing the comparative values of $P(w_0) = \int_{w_0}^{\infty} p(w)\,dw$, at $w_0 = 2.8661$,
the normal-theory 5% point, for $\nu_1 = 4$ and $\nu_2 = 20$ degrees of freedom

                                  λ3²
  λ4      0.00     0.25     0.50     0.75     1.00     1.50     2.00

 -1.5    0.0536   0.0538   0.0541
 -1.0    0.0524   0.0526   0.0529   0.0531   0.0534
 -0.5    0.0512   0.0514   0.0517   0.0519   0.0522   0.0527
  0.0    0.0500   0.0502   0.0505   0.0507   0.0510   0.0515   0.0519
  0.5    0.0488   0.0490   0.0493   0.0495   0.0498   0.0503   0.0507
  1.0    0.0476   0.0478   0.0481   0.0483   0.0486   0.0491   0.0495
  1.5    0.0464   0.0466   0.0469   0.0471   0.0474   0.0479   0.0483
  2.0    0.0452   0.0454   0.0457   0.0459   0.0462   0.0467   0.0471
  2.5    0.0440   0.0442   0.0445   0.0447   0.0450   0.0455   0.0459










experimental results shown in his Tables V (a)-V (c). He considered the distribution of the
squared correlation ratio $E^2$ (which he called $\eta^2$), where $E^2 = X/(X + Y)$, and found that the
experimental tail frequencies for leptokurtic populations were generally greater than those
of the normal-theory case. Table 2 shows results for two of his symmetrical leptokurtic
distributions, having $\lambda_4$ ($= \beta_2 - 3$) equal to 1.1 and 4.1 respectively. In this table col. (6)
shows $MP_0(E^2)$, the frequency of sample $E^2$ expected on normal theory to lie beyond the
values of $E^2$ given in col. (5), while col. (7) shows the frequency F actually found beyond
these limits in Pearson's sampling experiments. The limits chosen are as near as possible to
the normal-theory 5% points. Col. (8), calculated from the sum of the tail terms of the
binomial $\{1 - P_0(E^2) + P_0(E^2)\}^M$, gives the probability, say $\pi(F)$, that an experimental
frequency will be as great or greater than the observed F if the normal-theory law is
appropriate. An overall test of significance is provided by calculating $-2\log_e \pi(F)$ (col. (9)),
summing and referring this sum to the $\chi^2$ distribution with 20 degrees of freedom. The
resulting value of 32.29 is just beyond the 5% level (at 31.41). Thus there is some evidence,
though it is not very strong, that the experimental tail frequencies were in excess of
expectation.
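The overall test described above can be re-traced from the printed probabilities. In the sketch below the values of π(F) are those printed in col. (8) of Table 2; the function names are illustrative, and the closed form used for the chi-square tail is the standard one for an even number of degrees of freedom.

```python
from math import exp, log

# pi(F) values as printed in col. (8) of Table 2.
PI_F = [0.3528, 0.0503, 0.0839, 0.6903, 0.2676,
        0.0339, 0.5939, 0.2084, 0.3731, 0.2270]


def combined_statistic(probabilities):
    """Sum of -2 log_e(pi) over the separate tests."""
    return sum(-2.0 * log(p) for p in probabilities)


def chi_square_upper_tail_even(x, df):
    """P(chi-square with an even number of d.f. exceeds x), closed form."""
    if df % 2:
        raise ValueError("closed form used here needs even degrees of freedom")
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= (x / 2.0) / j
        total += term
    return exp(-x / 2.0) * total


if __name__ == "__main__":
    stat = combined_statistic(PI_F)
    print(stat, chi_square_upper_tail_even(stat, 2 * len(PI_F)))
```

Each separate test contributes 2 degrees of freedom, which is why ten tests are referred to the distribution with 20 degrees of freedom.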


Table 2. Comparison of Pearson’s sampling results with normal theory 





















































Parental   Size of samples          For        Normal-theory    Experimental   Probability   -2 log_e π(F)   D.F.
excess                              values     expectation      observed       π(F)
λ4           N      k      M        of E²      M P0(E²)         frequency F
(1)         (2)    (3)    (4)       (5)        (6)              (7)            (8)           (9)             (10)

1.1          40     2      50       > 0.66      3.0 (0.060)      4             0.3528        2.08            2
1.1          40     2      50       < 0.34      4.7 (0.094)      9             0.0503        5.98            2
4.1          40     2      50       > 0.66      3.0 (0.060)      6             0.0839        4.96            2
4.1          40     2      50       < 0.34      4.7 (0.094)      4             0.6903        0.74            2

1.1          50     5     100       > 0.32      5.2 (0.052)      7             0.2676        2.64            2
1.1          50     5     100       < 0.08      6.5 (0.065)     12             0.0339        6.77            2
4.1          50     5     100       > 0.32      5.2 (0.052)      5             0.5939        1.04            2
4.1          50     5     100       < 0.08      6.5 (0.065)      9             0.2084        3.14            2
4.1          25     5     200       > 0.36     10.6 (0.053)     12             0.3731        1.97            2
4.1          25     5     200       < 0.04     13.8 (0.069)     17             0.2270        2.97            2

Sum                                                                                          32.29           20


Example 1. W. A. Shewhart's data (being records of 200 tests for insulation resistance in
megohms), as studied by Pearson (1931, p. 115), is significantly non-normal, having
$\lambda_3^2 = 0.578$, $\lambda_4 = 1.317$. When the data are divided into twenty samples of ten observations
and a test for the significance of differences between the means is carried out, the value of
z is found to be z = 0.577 for $\nu_1 = 19$ and $\nu_2 = 180$ degrees of freedom. From Table 4, by
rough interpolation, we find $P'_{\lambda_4}(w_0) = -0.0003$, and $P'_{\lambda_3^2}(w_0) = 0.0003$. Considering the
above values of the λ's as fair estimates of the population values, we have from (2-30),
$P(w_0) = 0.0498$, showing that the standard table 5% point is very nearly also the 5% point
for the present population. The normal-theory test is accordingly valid for this case.
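The arithmetic behind the value 0.0498 can be written out. Since the twenty samples are of equal size, the term in Δk² drops out (an inference made here from (2-27)), and with the interpolated corrections −0.0003 and 0.0003 quoted above:

```latex
P(w_0) \approx 0.0500 + \lambda_4\,P'_{\lambda_4}(w_0) + \lambda_3^{2}\,P'_{\lambda_3^{2}}(w_0)
       = 0.0500 + (1.317)(-0.0003) + (0.578)(0.0003)
       \approx 0.0498 .
```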
















The formulae and the tables for the probability corrections indicate that the presence of
decided skewness in leptokurtic populations may sometimes contribute to the stability of
the normal-theory law of w, especially when the samples are not large enough to make the
effects of the term in $\lambda_3^2$, as usual, less important than that in $\lambda_4$. The situation is illustrated
in the following example.


Example 2 (S. J. Pretorius, 1930). Measurements of length were made on a large population
(9440) of a certain species of beans; for their frequency distribution $\lambda_3^2 = 0.8291$,
$\lambda_4 = 1.8629$. Would it be justifiable to consider that there is no significant difference between
the means of the following samples of 4, 5 and 6 beans assumed to be randomly drawn from
populations having a common standard deviation and described by the Edgeworth series
with these λ's?
Bean lengths: (1) 13.5, 14.0, 16.0, 16.5.
              (2) 12.5, 13.5, 14.5, 15.0, 16.5.
              (3) 11.0, 11.5, 12.0, 13.5, 14.5, 15.5.

Here w = 2.0442 for $\nu_1 = 2$, $\nu_2 = 12$ degrees of freedom. Further, from (2-27), $\Delta k^2 = -0.25$.
The probability at $w_0 = 3.88$ (the normal-theory 5% point), as calculated from formula
(2-35) and Table 4, is

P(w_0) = 0.0500 + (1.8629) {(−0.0030) + (−0.25) (−0.0010)}
                + (0.8291) {(0.0013) + (−0.25) (0.0007)}
       = 0.0458,
so that $w_0$ is still approximately the 5% point for the above population. As the value of
w is well below it, there is no reason to suspect the common source of the samples.
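The figures of Example 2 can be re-traced step by step. The sketch below computes the between-groups and within-groups sums of squares of (2-2), the ratio w, and Δk² of (2-27) from the bean lengths, and then applies formula (2-35) with the four corrections quoted in the example from Table 4. The function and argument names are illustrative.

```python
def variance_ratio(groups):
    """Return (w, nu1, nu2, delta_k_sq) for a one-way classification."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    nu1, nu2 = k - 1, N - k
    w = (between / nu1) / (within / nu2)
    delta_k_sq = k * k - N * sum(1.0 / len(g) for g in groups)  # (2-27)
    return w, nu1, nu2, delta_k_sq


def corrected_tail(p0, lam4, lam3_sq, delta_k_sq, p4, p4_dd, p3, p3_dd):
    """Formula (2-35): p4, p3 are the P' terms and p4_dd, p3_dd the P'' terms."""
    return (p0
            + lam4 * (p4 + delta_k_sq * p4_dd)
            + lam3_sq * (p3 + delta_k_sq * p3_dd))


BEANS = [
    [13.5, 14.0, 16.0, 16.5],
    [12.5, 13.5, 14.5, 15.0, 16.5],
    [11.0, 11.5, 12.0, 13.5, 14.5, 15.5],
]

if __name__ == "__main__":
    w, nu1, nu2, dk2 = variance_ratio(BEANS)
    p = corrected_tail(0.0500, 1.8629, 0.8291, dk2,
                       -0.0030, -0.0010, 0.0013, 0.0007)
    print(w, nu1, nu2, dk2, p)
```

The negative kurtosis correction outweighs the positive skewness correction here, which is why the corrected probability falls slightly below 0.05.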


3. DISTRIBUTION OF THE VARIANCE RATIO IN TWO INDEPENDENT SAMPLES


(3-1) Frequency function of the ratio of variances in samples from two different 
populations and the probability corrections for kurtosis and skewness 


If 2}, %3,...,%, and 24,29,...,2%,- are two samples drawn respectively from two different 
populations, (A3,A4) and (Aj, Aq), of the form (2-1), having zero mean and the same unit 
variance, then, proceeding exactly as before, the frequency density of 


_. (eA) © 4 =" 4 nome _ MoS 
at ~ pe PS ei-2) 2 (e z") “-9 
will be found to be 


p(v) = B(4m, 4g; v) 
+(NaA(™Z*, Bs »)—2mga(™Z, E> 0) + Naa, *E4, o)| 


+6 n N+4 n.+2 
— [Na A(™E*, B: v) —3ma(™F*, 2: 0) 


+3Nah(™S=, rf, 0) -MuA(3 are; o)| (3°1) 


= Po(v) + (Ag + Ag) Dag(v) + H(Ag — AQ) Pag(v) — HAS? + Ag?) Paa(v) — (As? — A3*) Payl), 
(3-1 bis) 














































where 
eee [H(Ag + Ag) {my M9(m, + Mg + 2) — 2(my — Mq)} — H(AQ— Ag) {1ry a(t — My + 4) + 2(mry + M9)}) 
a wi 8(m, + 1) (M_+ 1) (my + Mg) (My +Nq+ 2) 
“a [HAG + AG) {70 M9(my + My + 2)} — HAG — Aj) {01 9(% — No)}] 
ee NA 8(m, + 1) (My + 1) (ny + Mg) (Ny + Ng + 2) , : 
a ae [E(Ag + Ag) {my Mo(M1 + My t+ 2) + 2(ny — Ng)} — $(Ag — Ag) {1 Mo(M — My — 4) — 2(m + Ng)}] 
—_ = 8(n, + 1) (nm. + 1) (ny + Mg) (ny + y+ 2) : 
(3-2) 
and 
[$(Ag? +Ag?) {(my + 1) (my + 2) (my + 4) (My— 1) — (my — 1) (Mg + 1) (1g — 2) (Mg — 4)} 
A te carn + $(Ag? — Ag®) {2(my Mg — 1) (my + 2) (my + 4) + (My + Mg) (my — 1) (My + 1) (Mg — 1, — 6)}] 
— er 12(n + 1) (mq + 1) (m+ Mg) (Ny + Mg + 2) (Ny + Ng + 4) ’ 
[$(Ag? + Ag?) {04(my + 1) (my + 2) (mg — 1) —Nq(Mq+ 1) (Mg — 2) (ny — 1)} 
Nee = 2% ———_—_- — + 4(Ag? — Ag?) {m4 (My + 1) (4 + 2) (Mg— 1) + Nq(My + 1) (Mg — 2) (M1 — 1} 
- bi 12(n, + 1) (mg + 1) (my + Ng) (My + Mg + 2) (Ny + Nq+ 4) 
[3(Ag? + Ag?) {23(704 + 1) (my — 2) (2g — 1) — Mq(mq + 1) (mq + 2) (m — 1) 
erties + $(Ag? —Ag*) {my (m1 + 1) (m4 — 2) (7%q— 1) + Ma(Mg + 1) (mg + 2) (my — 13) 
— 12(n, + 1) (m+ 1) (ny + Mg) (My + Mg + 2) (My + q+ 4) 
[4(Ag2 + Ag?) {(m, + 1) (my — 2) (my — 4) (mg — 1) — (my — 1) (2g + 1) (Mg + 2) (q+ 4)} 
Bb 4 i + $(Ag? — Ag?) {2(m Mg — 1) (Mg + 2) (My + 4) + (my + Ng) (my +1) (m_— 1) (ny — M2 — 6)}] 
ou Not 12(ny + 1) (mg + 1) (My + Mg) (My + Mgt 2) (Ny + M2 + 4) 4 





(¥*3) 

For deriving the joint distribution of the two estimates of variance, which are essentially 
independent in the present case, the author’s (Gayen, 1949) result (3-1) was used. The 
alternative forms of the N-coefficients, viz.

Nyy = 4AG+Ag) Nan + H(Ag—Ag) No, and Nyy = $(Ag? + Ag?) Nag + 3(A3?— As?) Nig, (3-4) 
enable us to write the frequency function p(v) in the form given in (3-1 bis). As before, note 
_ Nyy —2Nog+Nog = 0 and Ny, —3Noq+3Nog— Nog = 0, (3:5) 
which serve as useful checks to our calculations. 

From the frequency function (3-1), by direct integration, we get 
4no(m2—1) 


Da No ” ne — ees 4) 
meen sa Gat eae As (mg — 2) (mg + 1) (my + 2) (ng + 4) 


2 4 8 Ye oe | no{ 4 24 
alte + att h(i aat +3)- rv(s - a) (3:6) 

















NN, Ng Ng 2 
2n3(n. +n, — 2) : ne 
Vary = — 32 1 ry 
Ny (Ng — 2)” (Ng — 4) *(n 2— 2) (nm, — 4) (ny +1) 
bel. ng{N9(n + 6) + 2(m — 6)} eh + 83 (My — 1) {n9(m, + 4) — 8} 
* (mq — 2)? (mq + 1) (mg + 2) (Mg — 4) > ny (mq— 2)? (mM — 4) (mg + 1) (Mg + 2) (Mg + 4) 
a2 4 +8), ys (2 ™ 1) .a%(! aif): amath, an 
ile Ny Ng Ny Ng 1 Ny N3 n,n3 













The asymptotic expressions in (3-6) and (3-7) were found to check with results obtained
otherwise by using E. S. Pearson's (1931) formulae (17) and (18). The tail area
Pv) = |" plo) do = Fev) + HAL +AY) Pre) 
+ 4(Ag — Ag) Pag(Vo) — HAs? + As?) Paq(%) — $(Ag?— Ag?) Pax(%),  (3°8) 


at any typical value vp, involves expressions similar to those occurring in (2-30). The integrals 
P5,(Up), Py,(vo) are now of the form 


Not4 n Not+2 nN, +2 No N,+4 
P,,(%) = [ Mate, 3) - 2k , ? . )+ Mala "2. - )| (3-9) 

















(Ma _( Ma, Me), 
= Qarha(1 —2,)im at?) Mr +2) (mg 27) (3-9 bis) 
B (mz? Ng + *) 
oT ey 

where $x_0 = n_2 v_0/(n_1 + n_2 v_0)$. The precise expressions for the $P(v_0)$'s of (3-8) can be easily
written down after replacing $\nu_1$, $\nu_2$ by $n_1$, $n_2$ and the ($\nu$) coefficients by the corresponding
N coefficients and then putting proper bars on the $\lambda$'s and appropriate dashes on N coefficients
in the formulae (2-33), (2-33 bis) and (3-9), (3-9 bis). The functions $P(v_0)$ have been
tabulated (Table 4), as before, at the 5% points of the normal-theory v, for the same representative
values of the degrees of freedom. For certain values of $n_1$ and $n_2$, the comparative
effects of these corrective factors will be seen in Figs. 3 and 4.

The probability corrections for kurtosis are large but those for skewness rather small.
Note that in most cases of large samples the $P(v_0)$ attain their maximum values. The formula
for the tail probability of, say, $\xi' = \chi^2/n_1$ (the variance ratio for $n_1$ degrees of freedom in the
case $n_2 = \infty$) at any typical point $\xi'_0$, is obtained from the author's (Gayen, 1949) result

$$P(\xi'_0) = P_0(\xi'_0) + \bar\lambda_4\,P_{\lambda_4}(\xi'_0) + \bar\lambda_3^2\,P_{\lambda_3^2}(\xi'_0), \qquad (3\text{-}10)$$

where (with the usual notations of I(u, p) functions, as defined by Karl Pearson)

















$P_0(\xi'_0) = 1 - I(u_0, p_0)$, the normal-theory value, (3-11)
nl 
PCa) = gi fy (a Pa) — 2 4 Ps) + I (tor Po)} (3-12) 
<n y _ 
~ 8(m, +1) T( p+ | ~ (Dot a ata 
~My (0 —1 

Pas Ge) = FE egy L(t Ps) — BU Oy Pa) + 31 (Wy, Px) Hoy Pa)} (818) 
_ —%%(n,— 1) eee __ yy y 13 bis 
—T2(m +1) P(p+2)\" (+2) * (Met2)(Hra)> 1S) 


having 
Ug = (42) So, Uy = f[my/(M1+2)] Up, Ug = al[y/(n,+4)]u, Us = a[4/(n, + 6)] ae 
Po= 3% -1, pp=4my, Pe= }ny,+1, py = }hn,+2 
(3-14) 


and Y = Uy (Pot 1) = Ug V( M9 + 2) ‘ 

= Up Po = u_s4(Py—1) =... (3-15) 
The tail areas of $\xi'' = n_2/\chi^2$ (the variance ratio for $n_2$ degrees of freedom when $n_1 = \infty$) beyond
any typical point $\xi''_0$ are also obtainable from (3-10) when $\xi'_0$, $n_1$, $\bar\lambda_4$, $\bar\lambda_3^2$ are respectively
replaced by $1/\xi''_0$, $n_2$, $\bar{\bar\lambda}_4$, $\bar{\bar\lambda}_3^2$, and obvious changes of sign are made in formulae (3-11)-(3-15).
Maxine Merrington & Catherine M. Thompson's (1943) percentage points of F, the normal-theory
variance ratio, have been used for the tabulation work.






































Fig. 3. Showing the comparative values of $P_{\lambda_4}(v_0)$ and $P'_{\lambda_4}(v_0)$, at the 5% points of the
normal-theory v, for certain values of $n_1$ and $n_2$.























Fig. 4. Showing the comparative values of $P_{\lambda_3^2}(v_0)$ and $P'_{\lambda_3^2}(v_0)$ at the 5% points of the
normal-theory v, for certain values of $n_1$ and $n_2$.

Solid line: $P_{\lambda_3^2}(v_0)$; broken line: $P'_{\lambda_3^2}(v_0)$.










(3-2) The applicability of the distribution of v 

The frequency function of v, (3-1), will have in common some of the asymptotic properties
previously pointed out to have been possessed by p(w) of (2-22). R. C. Geary's formula (2-4)
for the first two moments of z for any population shows that $\lambda_6$ and similar-order $\lambda$'s can
first occur in terms in $n^{-2}$; so that our expression (3-1), although it contains the corrective
terms in $\lambda_4$ and $\lambda_3^2$ only, may furnish a fair approximation to the actual distribution of v for
any population, provided the samples are not too small. The probabilities, $P(v_0)$, for $n_1 = 4$
and $n_2 = 20$ degrees of freedom at $v_0 = 2.8661$, have been tabulated (Table 3), as before, for
a wide range of values of $\lambda_3^2$ and $\lambda_4$, on the assumption that the two populations are of the same
shape. The experimental results of Pearson's (1931) Table VI are in most cases found to be
in fair correspondence with our theoretical calculations.


Table 3. Showing the comparative values of $P(v_0) = \int_{v_0}^{\infty} p(v)\,dv$, at $v_0 = 2.8661$,
the normal-theory 5% point, for $n_1 = 4$ and $n_2 = 20$ degrees of freedom

  λ4 \ λ3²  |  0.00  |  0.25  |  0.50  |  0.75  |  1.00  |  1.50  |  2.00
   -1.5     | 0.0112 | 0.0099 | 0.0089 |        |        |        |
   -1.0     | 0.0241 | 0.0229 | 0.0216 | 0.0203 | 0.0190 |        |
   -0.5     | 0.0371 | 0.0358 | 0.0345 | 0.0332 | 0.0320 | 0.0294 |
    0.0     | 0.0500 | 0.0487 | 0.0474 | 0.0462 | 0.0449 | 0.0423 | 0.0398
    0.5     | 0.0629 | 0.0617 | 0.0604 | 0.0591 | 0.0578 | 0.0553 | 0.0527
    1.0     | 0.0759 | 0.0746 | 0.0733 | 0.0720 | 0.0707 | 0.0682 | 0.0656
    1.5     | 0.0888 | 0.0875 | 0.0862 | 0.0850 | 0.0837 | 0.0811 | 0.0786
    2.0     | 0.1017 | 0.1004 | 0.0992 | 0.0979 | 0.0966 | 0.0940 | 0.0915
    2.5     | 0.1147 | 0.1134 | 0.1121 | 0.1108 | 0.1095 | 0.1070 | 0.1044
































Example 1. If in the illustration cited by Pearson & Neyman (1930) the two series of ten
cephalic indices have $\beta_2 = 3.5$ (as has been inferred), then at $v_0 = 3.1790$, the normal-theory
5% point, the true probability, as calculated from (3-8) and Table 4, is found to be 0.048.
Since the sample data give v = 8.328, which lies far above $v_0$, the conclusion that the samples
are from populations with the same variance is obviously not acceptable.

The values of $P'_{\lambda_4}(v_0)$ and $P'_{\lambda_3^2}(v_0)$ indicate that, in applications, a distinction should always
be made between samples from the same population and those from two different populations.
The kurtosis and the skewness appear to have some balancing effect on the tail probabilities
at the 5% points of both the test functions v and w; a comparison of the expressions
(2-29) and (3-7) suggests that it may also be generally true. The values of $P_{\lambda_4}(v_0)$ are
in most cases positive while those of $P_{\lambda_3^2}(v_0)$ are negative, the reverse being the situation with
$P_{\lambda_4}(w_0)$ and $P_{\lambda_3^2}(w_0)$. It is therefore not unlikely that the presence of deviations from normality
in both the measures and in both the parent distributions may sometimes, far from disturbing
the normal-theory law of the variance ratio, contribute towards its stability. The
point is illustrated in the example considered below.


Example 2 (from S. J. Pretorius, 1930). Frequency distributions of barometric heights at
Greenwich for a period of 79 years show for the summer months $\lambda_3^2 = 0.2016$ and $\lambda_4 = 0.2744$,
and for the winter months $\lambda_3^2 = 0.1422$ and $\lambda_4 = -0.1385$. Would it be justifiable to consider
from the following two samples of nine and thirteen observations, taken from the
summer and winter months respectively, that the variability of barometric changes in the
winter months is greater than that in the summer?


Sample I (summer months, barometric height in inches): 29.85, 29.75, 29.15, 30.05,
29.55, 30.35, 30.05, 30.25 and 29.35.

Sample II (winter months): 30.55, 28.45, 30.45, 30.15, 29.15, 29.95, 30.65, 28.85, 30.25,
30.35, 30.55, 29.55 and 29.05.

Here v = 3.3685 for $n_1 = 12$ and $n_2 = 8$ degrees of freedom. The probability at $v_0 = 3.2840$
(the normal-theory 5% point), as calculated from (3-8) and Table 4, is

$$P(v_0) = 0.0500 + (0.0680)(0.0257) + (0.2065)(-0.0140) - (0.1719)(-0.0078) - (0.0297)(-0.0072) = 0.0504.$$

Hence this point can also be regarded as the 5% point of v for the present populations.
Since the sample value of v falls beyond this point, the difference in variances must be regarded
as significant at the 5% level.
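The arithmetic of Example 2 can be retraced in a few lines. The Python sketch below recomputes the variance ratio from the two samples and then applies the correction terms with the Table 4 entries quoted in the text. All variable names are illustrative, and the pairing of the four tabulated values with the four coefficients simply follows the printed calculation; it is not an independent derivation.

```python
from statistics import variance

# Barometric heights (inches) as listed in Example 2.
summer = [29.85, 29.75, 29.15, 30.05, 29.55, 30.35, 30.05, 30.25, 29.35]
winter = [30.55, 28.45, 30.45, 30.15, 29.15, 29.95, 30.65, 28.85, 30.25,
          30.35, 30.55, 29.55, 29.05]

# Variance ratio: winter (12 d.f.) over summer (8 d.f.).
v = variance(winter) / variance(summer)

# Population constants quoted in the example.
l3sq_summer, l4_summer = 0.2016, 0.2744
l3sq_winter, l4_winter = 0.1422, -0.1385

# Table 4 entries for nu_1 = 12, nu_2 = 8 (two-sample columns, in printed order).
p_l4, p_l4_dash, p_l3sq, p_l3sq_dash = 0.0257, -0.0140, -0.0078, -0.0072

# Half-sums and half-differences of the population constants,
# taken in the order used by the printed calculation (summer, then winter).
half_sum_l4 = 0.5 * (l4_summer + l4_winter)
half_diff_l4 = 0.5 * (l4_summer - l4_winter)
half_sum_l3sq = 0.5 * (l3sq_summer + l3sq_winter)
half_diff_l3sq = 0.5 * (l3sq_summer - l3sq_winter)

# Corrected tail probability at the normal-theory 5% point.
p_v0 = (0.05
        + half_sum_l4 * p_l4
        + half_diff_l4 * p_l4_dash
        - half_sum_l3sq * p_l3sq
        - half_diff_l3sq * p_l3sq_dash)

print(f"v = {v:.4f}")          # about 3.368; the text prints 3.3685
print(f"P(v0) = {p_v0:.4f}")   # 0.0504
```

The small difference in the fourth decimal of v presumably reflects rounding in the original hand computation.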


SUMMARY 


The mathematical forms of the frequency distributions of the variance ratio used for testing 
(i) the homogeneity of a set of means, in the case of a one-way classification for analysis of 
variance with equal or unequal numbers (the test function denoted here by w) and (ii) the 
compatibility of two variances (the test function, v) have been derived for samples drawn 
from non-normal universes characterized by a priori values of $\lambda_3$ (= $\sqrt{\beta_1}$), a measure of skewness,
and $\lambda_4$ (= $\beta_2 - 3$), a measure of excess, and expressed by the first four terms (including
the term in $\lambda_3^2$) of the Edgeworth series. The formulae which have furnished the corrective
terms in $\lambda_4$ and $\lambda_3^2$, in addition to the normal-theory function in each case, are (a) sufficiently
accurate for any size of sample from populations agreeing well with the terms up to $\lambda_3^2$ of
the Edgeworth series, and (b) valid asymptotically for any form of universe. Thus it is not
unlikely that for moderate sizes of sample the distributions have quite an extended range of 
applicability, provided always that the higher order even-cumulants are small. 

The expressions deduced for the corrective tail areas and the tabulated results in certain 
cases (Table 4) will enable us to examine the true probabilities of the test functions for
a priori values of the universal $\lambda_3$ and $\lambda_4$. Where the exact knowledge of these parameters
is not available, we may sometimes safeguard against errors by considering the corrections 
for their plausible values. Practical applications have been considered and diagrams have 
been constructed to show the comparative effects of parental skewness and excess on the 
normal-theory upper 5 % tail probabilities for a few representative samples. 


I should like to acknowledge gratefully the help and advice I have received from Dr H. E. 
Daniels in the course of my investigations. 








Table 4. Corrective functions for determining the tail probabilities of 
(i) w and (ii) v at their upper 5 % normal-theory significance levels 












































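      |           | A set of k samples                                                          | Two independent samples
      | 5% points | (test for homogeneity of a set of means)                                    | (test for compatibility of two variances)
$\nu_2$ | of z    | $P_{\lambda_4}(w_0)$ | $P'_{\lambda_4}(w_0)$ | $P_{\lambda_3^2}(w_0)$ | $P'_{\lambda_3^2}(w_0)$ | $P_{\lambda_4}(v_0)$ | $P'_{\lambda_4}(v_0)$ | $P_{\lambda_3^2}(v_0)$ | $P'_{\lambda_3^2}(v_0)$
$\nu_1 = 1$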
1 | 2-5421 | —0-0027 | —0-0055 0-0054 | 0-0055 | 0-0021 | —0-0042 0-0000 0-0000 
2 | 1-4592 | —0-0048 | —0-0060 0-0084 | 0-0047 | 0-0066 | —0-0077 | —0-0022 | —0-0022 
3 | 1-1577 | —0-0052 | —0-0052 0-0073 | 0-0037 | 0-0096 | —0-0091 | —0-0034 | —0-0034 
4 | 1-0212 | —0-0048 | —0-0042 0-0055 | 0-0031 | 0-0110 | —0-0093 | —0-0037 | —0-0037 
5 | 0-9441 | —0-0043 | —0-0034 0-0039 | 0-0028 | 0-0116 | —0-0089 | —0-0035 | —0-0035
6 | 0-8948 | —0-0038 | —0-0028 0-0028 | 0-0025 | 0-0117 | —0-0083 | —0-0032 | —0-0032 
8 | 0-8355 | —0-0030 | —0-0020 0-0015 | 0-0021 | 0-0114 | —0-0070 | —0-0025 | —0-0025 
12 | 0-7788 | —0-0020 | —0-0012 0-0005 | 0-0016 | 0-0103 | —0-0048 | —0-0016 | —0-0016 
20 | 0-7352 | —0-0011 | —0-0006 0-0001 | 0-0011 | 0-0086 | —0-0021 | —0-0008 | —0-0008 
24 | 0-7246 | —0-0009 | —0-0005 0-0000 | 0-0010 | 0-0080 | —0-0013 | —0-0006 | —0-0006
30 | 0-7141 | —0-0007 | —0-0004 | — 0-0000 0-0008 | 0-0074 | —0-0004 | —0-0004 | —0-0004 
40 | 0-7037 | —0-0005 | —0-0003 | —0-0000 | 0-0006 | 0-0067 0-0006 | —0-0002 | —0-0002 
60 | 0-6933 | —0-0003 | —0-0002 | —0-0000 | 0-0004 | 0-0059 0-0016 | —0-0001 | —0-0001 
120 | 0-6830 | —0-0002 | —0-0001 | —0-0000 | 0-0001 | 0-0050 0-0028 | —0-0001 | —0-0001 
co | 0-6729 | —0-0000 | —0-0000 | —0-0000 | 0-0000 | 0-0040 0-0040 | —0-0000 | —0-0000 
$\nu_1 = 2$
1 | 2-6479 | —0-0023 | —0-0029 0-0053 | 0-0031 | 0-0021 | —0-0042 0-0017 | —0-0017 
2 | 1-4722 | —0-0043 | —0-0032 0-0091 | 0-0026 | 0-0071 | —0-0079 | —0-0024 | —0-0024 
3 | 1-1284 | —0-0050 | —0-0029 0-0094 | 0-0019 | 0-0112 | —0-0095 | —0-0042 | —0-0037 
4 | 0-9690 | —0-0052 | —0-0026 0-0082 | 0-0015 | 0-0142 | —0-0099 | —0-0052 | —0-0042 
5 | 0-8777 | —0-0050 | —0-0023 0-0066 | 0-0012 | 0-0159 | —0-0095 | —0-0055 | —0-0039 
6 | 0-8188 | —0-0047 | —0-0020 0-0053 | 0-0010 | 0-0170 | —0-0087 | —0-0055 | —0-0035 
8 | 0-7475 | —0-0041 | —0-0015 0-0033 | 0-0009 | 0-0179 | —0-0068 | —0-0052 | —0-0024 
12 | 0-6786 | —0-0030 | —0-0010 0-0013 | 0-0007 | 0-0181 | —0-0033 | —0-0046 | —0-0005 
20 | 0-6254 | —0-0019 | —0-0006 0-0002 | 0-0005 | 0-0171 0-0012 | —0-0040 0-0014
24 | 0-6123 | —0-0016 | —0-0005 0-0000 | 0-0005 | 0-0167 0-0026 | —0-0039 0-0019
30 | 0-5994 | —0-0013 | —0-0004 | —0-0001 | 0-0004 | 0-0161 0-0042 | —0-0038 0-0024 
40 | 0-5866 | —0-0010 | —0-0003 | —0-0001 | 0-0003 | 0-0154 0-0060 | —0-0038 0-0029 
60 | 0-5738 | —0-0006 | —0-0002 | —0-0001 | 0-0002 | 0-0146 0-0079 | —0-0038 0-0034 
120 | 0-5611 | —0-0003 | —0-0001 | —0-0001 | 0-0001 | 0-0136 0-0101 | —0-0039 0-0038 
co | 0-5486 | —0-0000 | --0-0000 | —0-0000 | 0-0000 | 0-0124 0-0124 | —0-0042 0-0042 
$\nu_1 = 3$

1 | 2-6870 | —0-0020 | —0-0020 0-0051 | 0-0022 | 0-0022 | —0-0041 0-0018 | —0-0018
2 | 1-4765 | —0-0037 | —0-0022 0-0091 | 0-0019 | 0-0074 | —0-0080 | —0-0024 | —0-0025 
3 | 1-1137 | —0-0046 | —0-0020 0-0102 | 0-0014 | 0-0120 | —0-0099 | —0-0046 | —0-0040 
4 | 0-9429 | —0-0049 | —0-0018 0-0095 | 0-0010 | 0-0154 | —0-0105 | —0-0058 | —0-0046 
5 | 0-8441 | —0-0049 | —0-0016 0-0082 | 0-0008 | 0-0178 | —0-0101 | —0-0063 | —0-0046 
6 | 0:7798 | —0-0047 | —0-0014 0-0069 | 0-0006 | 0-0194 | —0-0093 | ~0-0064 | —0-0042 
8 | 0-7014 | —0-0042 | —0-0012 0-0047 | 0-0005 | 0-0213 | —0-0073 | —0-0062 | —0-0031 
12 | 0-6250 | —0-0034 | —0-0008 0-0022 | 0-0003 | 0-0225 | —0-0031 | —0-0056 | —0-0010 
20 | 0-5654 | —0-0023 | —0-0005 0-0005 | 0-0003 | 0-0225 0-0026 | —0-0050 0-0015 
24 | 0-5508 | —0-0019 | —0-0004 0-0002 | 0-0002 | 0-0222 0-0046 | —0-0049 0-0022 
30 | 0-5362 | —0-0016 | —0-0003 0-0000 | 0-0002 | 0-0218 0-0067 | —0-0048 0-0029 
40 | 0-5217 | —0-0012 | —0-0002 | -—0-0002 | 0-0002 | 0-0213 0-0092 | —0-0048 0-0036 
60 | 0-5073 | —0-0008 | —0-0002 | —0-0002 | 0-0001 | 0-0206 0-0119 | —0-0049 0-0043 
120 | 0-4930 | —0-0004 | —0-0001 | —0-0002 | 0-0001 | 0-0196 0-0150 | —0-0051 0-0050 
co | 0-4787 | —0-0000 | —0-0000 | —0-0000 | 0-0000 | 0-0185 0-0185 | —0-0056 0-0056 












































The approximate true probabilities are given by: 
(i) P(w) = 0-05 + Ag P),(we) + Ak*P),(w,)} + AHPy3(wo) + AK*P43(w,)}, where Ak? is defined in (2-27), 
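(ii) $P(v_0) = 0.05 + \tfrac12(\bar\lambda_4 + \bar{\bar\lambda}_4)\,P_{\lambda_4}(v_0) - \tfrac12(\bar\lambda_3^2 + \bar{\bar\lambda}_3^2)\,P_{\lambda_3^2}(v_0) + \tfrac12(\bar\lambda_4 - \bar{\bar\lambda}_4)\,P'_{\lambda_4}(v_0) - \tfrac12(\bar\lambda_3^2 - \bar{\bar\lambda}_3^2)\,P'_{\lambda_3^2}(v_0)$.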
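The column headed "5% points of z" can be related to the normal-theory 5% points of the variance ratio quoted in the text. The Python sketch below assumes that z is Fisher's z, that is, half the natural logarithm of the variance ratio; this reading is an assumption, since the definition of z is not restated here. The helper name `z_from_ratio` is hypothetical.

```python
import math

def z_from_ratio(ratio: float) -> float:
    """Fisher's z for a variance ratio (assumed definition: z = 0.5 * ln(ratio))."""
    return 0.5 * math.log(ratio)

# Normal-theory 5% points of v quoted in the text, keyed by (nu_1, nu_2).
five_percent_points = {
    (4, 20): 2.8661,   # used for Table 3
    (12, 8): 3.2840,   # used in Example 2
}

# Corresponding Table 4 entries in the "5% points of z" column.
tabulated_z = {
    (4, 20): 0.5265,
    (12, 8): 0.5945,
}

for dof, ratio in five_percent_points.items():
    z = z_from_ratio(ratio)
    print(dof, round(z, 4), tabulated_z[dof])
```

Under that assumption both quoted points agree with the tabulated z values to four decimals, which also confirms that the block label of Table 4 is the first degree of freedom and the row label the second.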


. 





Table 4 (cont.) 






























































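      |           | A set of k samples                                                          | Two independent samples
      | 5% points | (test for homogeneity of a set of means)                                    | (test for compatibility of two variances)
$\nu_2$ | of z    | $P_{\lambda_4}(w_0)$ | $P'_{\lambda_4}(w_0)$ | $P_{\lambda_3^2}(w_0)$ | $P'_{\lambda_3^2}(w_0)$ | $P_{\lambda_4}(v_0)$ | $P'_{\lambda_4}(v_0)$ | $P_{\lambda_3^2}(v_0)$ | $P'_{\lambda_3^2}(v_0)$
$\nu_1 = 4$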
1 | 2-7071 | —0-0017 | —0-0015 0-0050 | 0-0017 | 0-0023 | —0-0040 0-0016 | —0-0016 
2 | 1-4787 | —0-0032 | —0-0016 0-0090 | 0-0015 | 0-0075 | —0-0080 | —0-0025 | —0-0025 
3 | 1-1051 | —0-0041 | —0-0015 0-0105 | 0-0011 | 0-0123 | —0-0102 | —0-0047 | —0-0042 
4 | 0-9272 | —0-0044 | —0-0014 0-0102 | 0-0008 | 0-0160 | —0-0110 | —0-0060 | —0-0050 
5 | 0-8236 | —0-0045 | —0-0012 0-0093 | 0-0006 | 0-0187 | —0-0108 | —0-0066 | —0-0051 
6 | 0-7558 | —0-0045 | —0-0011 0-0081 | 0-0004 | 0-0206 | —0-0101 | —90-0068 | —0-0049 
8 | 0-6725 | —0-0041 | —0-0009 0-0059 | 0-0003 | 0-0231 | —0-0081 | —0-0067 | —0-0039 
12 | 0-5907 | —0-0034 | —0-0006 0-0031 | 0-0002 | 0-0251 | —0-0037 | —0-0060 | —0-00)8 
20 | 0-5265 | —0-0024 | —0-0004 0-0010 | 0-0001 | 0-0259 0-0029 | —0-0051 0-0008 
24 | 0-5106 | —0-0021 | —0-0003 0-0005 | 0-0001 | 0-0258 0-0052 | —0-0049 0-0016 
30 | 0-4947 | —0-0017 | —0-0003 0-0002 | 0-0001 | 0-0256 0-0078 | —0-0048 0-0024 
40° | 0-4789 | —0-0013 | —0-0002 | —0-0001 | 0-0001 | 0-0252 0-0108 | —0-0048 0-0033 
60 | 0-4632 | —0-0009 | —0-0001 | —0-0002 | 0-0001 | 0-0247 0-0143 | —0-0049 0-0041 
120 | 0-4475 | —0-0005 | —0-0001 | —0-0002 | 0-0000 | 0-0239 0-0183 | —0-0051 0-0049 
co | 0-4319 | —0-0000 | —0-0000 | —90-0000 | 0-0000 | 0-0228 0-0228 | —0-0056 0-0056 

$\nu_1 = 5$
1 | 2-7194 | —0-0015 | —0-0012 0-0048 | 0-0014 | 0-0024 | —0-0039 0-0013 | —0-0013
2 | 1-4800 | —0-0029 | —0-0013 0-0089 | 0-0012 | 0-0076 | —0-0080 | —0-0025 | —0-0025
3 | 1-0994 | —0-0037 | —0-0012 0-0107 | 0-0009 | 0-0124 | —0-0104 | —0-0048 | —0-0044
4 | 0-9168 | —0-0040 | —0-0011 0-0107 | 0-0007 | 0-0163 | —0-0114 | —0-0061 | —0-0053 
5 | 0-8097 | —0-0042 | —0-0010 0-0100 | 0-0005 | 0-0191 | —0-0114 | —0-0068 | —0-0056 
6 | 0-7394 | —0-0042 | —0-0009 0:0089 | 0-0004 | 0-0213 | —0-0109 |} —0-0071 | —0-0054 
8 | 0-6525 | —0-0040 | —0-0007 0-0069 | 0-:0003 | 0-0241 | —0-0090 | —0-0069 | —0-0046 
12 | 0-5666 | —0-0034 | —0-0005 0-0040 | 0-0001 | 0-0268 | —0-0046 | —0-0061 | —0-0026 
20 | 0-4986 | —0-0025 | —0-0003 0-0014 | 0-0001 | 0-0282 0-0026 | —90-0050 0-0000 
24 | 0-4817 | —0-0021 | —0-0003 0-0009 | 0-0001 | 0-0283 0-0051 | —0-0047 0-0008 
30 | 0-4648 | —0-0018 | —0-0002 0-0004 | 0-0001 | 0-0282 0-0081 | —0-0046 0-0017 
40 | 0-4479 | —0-0014 | —0-0002 0-0001 | 0-0001 } 0-0280 0-0116 | —0-0044 0-0026 
60 | 0-4311 | —0-0010 | —0-0001 | —0-0001 | 0-0000 | 0-0276 0-0156 | —0-0044 0-0035 
120 | 0-4143 | —0-0005 | —0-0001 | —0-0002 | 0-0000 | 0-0269 0-0204 | —0-0047 0-0044 
co | 0-:3974 | —0-0000 | —0-0000 | —0-0000 | 0-0000 | 0-0259 0:0259 | —0-0052 0-0052 

$\nu_1 = 6$
1 |: 2-7276 | —0-0014 | —0-0010 0-0048 | 0-0011 | 0-0024 | —0-0038 0-0011 | —0-0011 
2 | 1-4808 | —0-0026 | —0-0011 0-0088 | 0-0010 | 0-:0076 | —0-0080 | —0-0025 | —0-0025 
3 | 10953 | —0-0033 | —0-0010 0-0107 | 0-0008 | 0-0125 | —0-0106 | —0-0048 | —0-0045 
4 | 0-9093 | —0-0037 | —0-0009 0:0110 | 0-0006 | 0-0164 | —0-0118 | —0-0062 | —0-0055 
5 | 0-7997 | —0-0039 | —0-0008 0-0104 | 0-0005 | 0-0194 | —0-0120 | —0-0069 | —0-0059
6 | 0-7274 | —0-0039 | —0-0008 0-0096 | 0-0004 | 0-0217 | —0-0116 | —0-0072 | —0-0059
8 | 0-6378 | —0-0038 | —0-0006 0-0076 | 0-0002 | 0-0248 | —0-0099 | —0-0071 | —0-0052
12 | 0-5487 | —0-0033 | —0-0005 0-0047 | 0-0001 | 0-0279 | —0-0055 | —0-0062 | —0-0034
20 | 0-4776 | —0-0025 | —0-0003 0-0019 | 0-0001 | 0-0297 0-0019 | —0-0049 | —0-0007 
24 | 0-4598 | —0-0022 | --0-0002 0-0013 | 0-0001 | 0-0300 0-0047 ; —0-0046 0-0001 
30 | 0-4420 | —0-0018 | —0-0002 0-0007 | 0-0000 0-0301 0-0079 | —0-0043 0-0010 
40 | 0-4242 | —0-0014 | —0-0001 0-0003 | 0-0000 | 0-0301 0-0118 | —0-0040 0-0019 
60 | 0-4064 | —0-0010 | —0-0001 | —0-0000 | 0-0000 | 0-0298 0-0164 | —0-0039 0-0029 
120 | 0-3885 | —0-0005 | —0-0001 | —0-0000 | 0-0000 | 0-0292 0-0218 | —0-0041 0-0038 
co | 03706 | —0-0000 | —0-0000 | —0-0000 | 0-0000 | 0-0283 0-0283 | —0-0045 0-0045 












































Table 4 (cont.) 








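      |           | A set of k samples                                                          | Two independent samples
      | 5% points | (test for homogeneity of a set of means)                                    | (test for compatibility of two variances)
$\nu_2$ | of z    | $P_{\lambda_4}(w_0)$ | $P'_{\lambda_4}(w_0)$ | $P_{\lambda_3^2}(w_0)$ | $P'_{\lambda_3^2}(w_0)$ | $P_{\lambda_4}(v_0)$ | $P'_{\lambda_4}(v_0)$ | $P_{\lambda_3^2}(v_0)$ | $P'_{\lambda_3^2}(v_0)$
$\nu_1 = 8$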
1 | 2-7380 | —0-0011 | —0-0008 0-0046 | 0-0009 | 0-0026 | —0-0037 0-0008 | —0-0008 
2 | 1-4819 | —0-0021 | —0-0008 0-0087 | 0-0008 | 0-0077 | —0-0080 | —0-0025 | —0-0025 
3 | 10899 | —0-0028 | —0-0008 0-0108 | 0-0007 | 0-0125 | —0-0108 | —0-0048 | —0-0046 
4 | 0-8993 | —0-0031 | —0-0007 0-0113 | 0-0005 | 0-0165 | —0-0124 | —0-0063 | —0-0058 
5 | 6-7862 | —0-0033 | —0-0006 0-0111 | 0-0004 | 0-0195 | —0-0129 | —0-0071 | —9-0064 
6 | 0-7112 | —0-0034 | —0-0006 0-0104 | 0-0003 | 0-0220 | —0-0128 | —0-0075 | —0-0065 
8 | 0-6175 | —0-0034 | —0-0005 0-0088 | 0-0002 | 0-0254 | —0-0116 | —0-0074 | —0-0061 
12 | 0-5234 | —0-0030 | —0-0004 0-0059 | 0-0001 | 0-0291 | —0-0075 | —0-0064 | —0-0045 
20 | 0-4474 | —0-0024 | —0-0002 0-0028 | 0-0000 | 0-0317 0-0001 | —0-0047 | —0-0020 
24 | 0-4283 | —0-0021 | —0-0002 0-0020 | 0-0000 | 0-0322 0-0031 | —0-0042 | —0-0012 
30 | 0-4090 | —0-0018 | —0-0002 | 0-0013 | 0-0000 | 0-0326 0-0068 | —0-0037 | —0-0003 
40 | 0-3897 | —0-0015 | —0-0001 |_ 0-0007 | 0-0000 | 0-0328 0-0112 | —0-0033 0-0007 
60 | 0-3702 | —0-0010 | —0-0001 | 0:0002 | 0-0000 | 0-0328 0-0167 | —0-0030 0-0016 
120 | 0-3506 | —0-0006 | —0-0000 | —0-0001 | 0-0000 | 0-0324 0-0234 | —0-0029 0-0025 
co | 0-3309 | —0-0000 | —0-0000 | —0-0000 | 0-0000 | 0-0316 0:0316 | —0-0033 0-0033 
$\nu_1 = 12$
1 | 2-7484 | —0-0008 | —0-0005 0-0045 | 0-0006 | 0-0027 | —0-0036 0-0005 | —0-0005 
2 | 1-4830 | —0-0016 | —0-0005 0-0084 | 0-0006 {| 0-0077 | —0-0079 | —0-0025 | —0-0025 
3 | 10842 | —0-0021 | —0-0005 0-0107 | 0-0005 | 0-0125 | —0-0112 | —0-0049 | —0-0047 
4 | 08885 | —0-0024 | —0-0005 | 0-0116 | 0-0004 | 0-0164 | —0-0131 | —0-0065 | —0-0062 
5 | 0-7714 | —0-0026 | — 0-0004 0-0117 | 0-0003 | 0-0195 | —0-0142 | —0-0073 | — 90-0069 
6 | 0-6931 | —0-0027 | —0-0004 ; 0-0114 | 0-0002 | 0-0220 | —0-0145 | —0-0078 | —0-0073 
8 | 0-5945 | —0-0027 | —0-0003 0-0101 | 0-0002 | 0-0257 | —0-0140 | —0-0078 | —0-0072 
i2 | 0-4941 | —0-0026 | —0-0002 0-0075 | 0-0001 | 0-0299 | —0-0109 | —0-0069 | —0-0060 
20 | 0-4116 | —0-0022 | —0-0002 0-0043 | 0-0000 | 0-0335 | —0-0036 | —0-0048 | —0-0037 
24 | 0-3904 | —0-0020 | —0-0001 0-0033 | 0-0000 | 0-0343 | —0-0005 | —0-0041 | —0-0028 
30 | 0-3691 | —0-0017 | —0-0001 0-0023 | 0-0000 | 0-0350 0-0035 | —0-0033 | —0-0020 
40 | 0-3475 | —0-0014 | —0-0001 0-0014 | 0-0000 | 0-0355 0-0087 | —0-0025 | —0-0010 
60 | 0-3255 | —0-0011 | —0-0001 0-0007 | 0-0000 | 0-0359 0-0153 | —0-0018 | —0-0001 
120 | 0-3032 | —0-0006 | —0-0000 0-0001 | 0-0000 | 0-0359 0-0240 | —0-0013 0-0007 
co | 0-2804 | —0-0000 | —0-0000 0-0000 | 00000 | 9-0354 0-0354 | —0-0013 0-0013 
$\nu_1 = 24$
| | 
1 | 27588 | —0-0005 | —0-0003 | 0-0043 | 0-0003 | 0-0029 | —0-0034 0-0000 | —0-0000 
2 | 1-4840 | —0-0009 | —0-0003 | 0-0081 | 0-0003 | 0-0078 | —0-0079 | —0-0026 | —0-0026 
3 | 10781 | —0-0012 | —0-0002 0-0104 | 0-0003 | 0-0124 | —0-0116 | —0-0049 | —0-0049 
4 | 0-8767 | —0-0014 | —0-0002 0-0117 | 0-0002 | 0-0161 | —0-0142 | —0-0066 | —0-0065 
5 | 07550 ; —0-0015 | —0-0002 0-0122 | 0-0002 | 0-0192 | —0-0159 | —0-0077 | —0-0075 
6 | 0-6729 | —0-0016 | —0-0002 | 0-0122 | 0-0001 | 0-0217 | —0-0170 | —0-0082 | —0-0081 
8 | 0-5682 | —0-0017 | —0-0002 | 0-0116 | 0-0001 | 0-0254 | —0-0178 | —0-0086 | —0-0084 
12 | 0-4592 | —0-0017 | —0-0001 0-0097 | 0-0001 | 0-0300 | —0-0169 | —0-0079 | —0-0078 
20 | 0-3668 | —0-0016 | —0-0001 0-0067 | 0-0000 | 0-0344 | —0-0117 | —0-0058 | —0-0058 
24 | 0-3425 | —0-0015 | —0-000i 0-0056 | 0-0000 | 0-0355 | —0-0090 | —0-0049 | —0-0050 
| 

30 | 0-3176 | —0-0014 | —0-0001 | 0-004 « | 0-0000 | 0-0367 | —0-0051 | —0-0038 | —0-0041 
40 | 0-2920 | —0-0012 | —0-0000 | 0-0032 | 0-0000 | 0-0378 0-0004 | —0-0025 | —0-0032 
60 | 0-2654 | —0-0009 | —0-0000 | 0-0019 | 0-0000 | 0-0387 0-0084 | —0-0012 | —0-0022
120 | 0-2376 | —0-0006 | —0-0000 0-0007 | 0-0000 | 0-0394 0-0204 0-0003 | —0-0014
∞ | 0-2085 | —0-0000 | —0-0000 0-0000 | 0-0000 | 0-0396 0-0396 0-0013 | —0-0013



































The approximate true probabilities are given by: 
(i) P(we) = 0-05 + Ag{P4,(0) + Ak*P),(we)} + AZ{Pyx(wo) + Ak*P3x(wy)}, where Ak? is defined in (2-27), 


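(ii) $P(v_0) = 0.05 + \tfrac12(\bar\lambda_4 + \bar{\bar\lambda}_4)\,P_{\lambda_4}(v_0) - \tfrac12(\bar\lambda_3^2 + \bar{\bar\lambda}_3^2)\,P_{\lambda_3^2}(v_0) + \tfrac12(\bar\lambda_4 - \bar{\bar\lambda}_4)\,P'_{\lambda_4}(v_0) - \tfrac12(\bar\lambda_3^2 - \bar{\bar\lambda}_3^2)\,P'_{\lambda_3^2}(v_0)$.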






















Table 4 (cont.) 





























Two independent samples
Case with $\nu_1 = \infty$ (test for compatibility of two variances)
Ve 5% points ofz | Px(vo) | Paro) | Paro) | Pas(re) 
1 2-7693 0-0031 0-0000 
2 1-4851 0-0079 ‘—0-0026 
3 1-0716 0-0121 — 0-0050 
4 0-8639 0-0156 — 0-0069 
5 0-7368 0-0184 * —0-0081 
6 0-€499 0-0206 — 0-0090 
8 0-5371 0-0239 —0-0101 
12 0-4156 0-0281 — 0-0103 
20 0-3057 0-0324 — 0-0096 
24 0-2749 0-0336 —0-0091 
30 0-2419 0-0349 — 0-0085 
40 0-2057 0-0363 — 0-0076 
60 0-1644 0-0379 — 0-0063 
120 0-1131 0-0396 —0-0044 
co 0-0000 . oo 
REFERENCES 


Eden, T. & Yates, F. (1933). J. Agric. Sci. 23, 6.

Gayen, A. K. (1949). Biometrika, 36, 353.

Geary, R. C. (1947). Biometrika, 34, 209.

Kendall, M. G. (1946). The Advanced Theory of Statistics, 2. London: Charles Griffin and Co.

Merrington, Maxine & Thompson, Catherine M. (1943). Biometrika, 33, 73.

Pearson, E. S. (1931). Biometrika, 23, 114.

Pearson, E. S. & Neyman, J. (1930). Bull. int. Acad. Cracovie, A, p. 73.

Pretorius, S. J. (1930). Biometrika, 22, 109.

Soper, H. E. (1921). The Numerical Evaluation of the Incomplete B-function. Tracts for Computers,
No. VII. Cambridge University Press.

Thompson, Catherine M. (1941). Biometrika, 32, 151.













THE COMPARISON OF PERCENTAGES IN MATCHED SAMPLES 


By W. G. COCHRAN 
Department of Biostatistics, School of Hygiene and Public Health, Johns Hopkins University 


1. INTRODUCTION 


The $\chi^2$ test has long been used to test the significance of differences between ratios or percentages
in two or more independent samples. It sometimes happens that each member of a
sample is matched with a corresponding member in every other sample, in the hope of securing 
a more accurate comparison among the percentages. The matching may be based either on 
the characteristics of the members, or on the fact that the partners in a group are subjected 
to some test that is the same for all members of the group but varies from one group to 
another. 

Since the matching may introduce correlation between the results in different samples, 
it invalidates the ordinary $\chi^2$ test, which gives too few significant results if the matching is
effective. For the case where there are two samples, an appropriate test is easily constructed. 
An example has been given by McNemar (1949), who presents this test. In his data, 205 
soldiers were asked whether they thought that the war against Japan would last more or 
less than a year. They were subsequently asked the same question after a lecture on the 
difficulties of the war against Japan. Matching occurs because each sample contains exactly 
the same soldiers. 

The replies may be classified in a 2 x 2 frequency table as shown in Table 1. 


Table 1. Comparison of ratios in two samples

                          After lecture
                        Less        More        Total
Before lecture: Less    36 (a)      34 (b)        70
                More     0 (c)     135 (d)       135
                Total   36         169           205




















Before the lecture, 70 men out of the 205 thought that the war would last less than a year,
whereas after the lecture this number has dropped to 36. The comparison which we wish to
make is that between the two frequencies 70/205 and 36/205. There are several ways in which
the test may be derived. Perhaps the easiest is to note that both numerators, 70 and 36,
contain the 36 (a) men who persisted in thinking that the war would last less than a year.
Hence, equality of the numerators would imply that the same number of men changed from
‘Less’ to ‘More’ as changed from ‘More’ to ‘Less’. In other words, if the lecture is without
effect we would expect half the persons who changed their minds to change in one direction
and half in the other. Thus the test can be made by testing whether the numbers (b) and
(c) are binomial successes and failures out of n = (b + c) trials, with probability ½. For this test

$$\chi^2 = \frac{(b - \tfrac12 n)^2}{\tfrac12 n} + \frac{(c - \tfrac12 n)^2}{\tfrac12 n} = \frac{(b - c)^2}{b + c} = \frac{(34 - 0)^2}{34 + 0} = 34,$$

with 1 degree of freedom. A correction for continuity can be applied by subtracting 1 from
the absolute value of the numerator before squaring.
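The two-sample calculation is simple enough to check directly. The Python sketch below evaluates the statistic from the two counts of changed opinions, with the continuity correction as an option; the function name `matched_pairs_chi2` is illustrative and not taken from the paper.

```python
def matched_pairs_chi2(b: int, c: int, continuity: bool = False) -> float:
    """Chi-square (1 d.f.) for two matched samples, from the discordant counts b and c."""
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    diff = abs(b - c)
    if continuity:
        # Subtract 1 from the absolute difference before squaring.
        diff = max(diff - 1, 0)
    return diff ** 2 / n

b, c = 34, 0  # counts (b) and (c) of Table 1

# The long form with expected counts n/2 gives the same value as (b - c)^2 / (b + c).
n = b + c
long_form = (b - n / 2) ** 2 / (n / 2) + (c - n / 2) ** 2 / (n / 2)

print(matched_pairs_chi2(b, c))                   # 34.0
print(long_form)                                  # 34.0
print(matched_pairs_chi2(b, c, continuity=True))  # 33**2 / 34, about 32.03
```

Only the discordant counts enter; the 36 and 135 men whose answers did not change play no part in the statistic.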

The two-sample case has also been discussed in a study by Denton & Beecher (1949), 
where the object was to find out whether subjects reacted more frequently to an injection of 
a new drug than they did to one of isotonic sodium chloride, which was used as a control. 
They give a x? test, attributed to Mosteller, which differs slightly from that given above. 

The object of this paper is to extend the test to the situation where there are more than 
two samples. An example is provided in some studies of variability among interviewers in 
sample surveys. Each interviewer called at a different group of houses, but any house assigned 
to an interviewer was matched with one of the houses assigned to each other interviewer 
according to the characteristics of the housewife. A test of whether the percentage of ‘yes’ 
answers to some question differed from interviewer to interviewer is a test of the type that 
we are considering.

In a second example, the effectiveness of a number of different media for the growth of
diphtheria bacilli was investigated by the Communicable Disease Centre, U.S. Public Health
Service. In one series, specimens were taken from the throats of sixty-nine suspected cases.
Each specimen was grown on each of four media A, B, C, D. The probability that growth 
takes place will depend on the number of diphtheria bacilli present, and in a number of 
cases there might well be no bacilli present. 


Table 2. Method of presentation suited to more than two columns

          Diphtheria media                          Soldiers' replies
   A     B     C     D    No. of cases       Before   After   No. of cases
   1     1     1     1         4                1       1       135 (d)
   1     1     0     1         2                0       1        34 (b)
   0     1     1     1         3                0       0        36 (a)
   0     1     0     1         1
   0     0     0     0        59

Totals ($T_j$):                              Totals:
   6    10     7    10                         135     169


























Results are shown in Table 2.* Where there are four media, the 2 x 2 table does not seem 
well adapted to a succinct presentation. Instead, each medium is allotted to one column of 
the table. A 1 denotes that growth occurred with that medium, a 0 that no growth occurred. 
Thus in Table 2 there were four specimens in which all four media exhibited growth, two 
specimens in which media A, B and D, but not C, showed growth, and so on. To illustrate 

* I wish to thank Dr Martin Frobisher, Chief, Bacteriology Laboratories, Communicable Disease 
Centre, U.S. Public Health Service, Atlanta, for permission to use these data for illustration. 














the relation to the method of presentation in a 2 x 2 table, McNemar’s results are also shown 
in this form, where a 1 denotes the answer ‘more than a year’. 

The column totals are the total numbers of 1’s. The problem is to test whether these totals 
differ significantly among media. 
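To make the layout of Table 2 concrete, the Python sketch below expands the condensed diphtheria table into one row per specimen and recovers the column totals. The variable names are illustrative; the patterns and counts are those printed in Table 2.

```python
# Each entry: (growth pattern on media A, B, C, D; number of specimens with that pattern).
patterns = [
    ((1, 1, 1, 1), 4),
    ((1, 1, 0, 1), 2),
    ((0, 1, 1, 1), 3),
    ((0, 1, 0, 1), 1),
    ((0, 0, 0, 0), 59),
]

# One row per specimen (matched group), one column per medium.
rows = [list(pattern) for pattern, count in patterns for _ in range(count)]

# Column totals: number of specimens showing growth on each medium.
column_totals = [sum(column) for column in zip(*rows)]

# Row totals: number of media on which each specimen showed growth.
row_totals = [sum(row) for row in rows]

print(len(rows))       # 69
print(column_totals)   # [6, 10, 7, 10]
```

The 59 specimens with no growth on any medium contribute nothing to any column total.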


2. MATHEMATICAL FRAMEWORK 


For a discussion of the theory of the test we shall adopt a less concise method of presentation 
than that given in Table 2. Each matched group will be placed in a different row of the table. 
Thus the table for the diphtheria data would contain 69 rows and 4 columns. The probability 
of a 1 is presumed to vary from row to row, usually in a manner that is known only vaguely. 
Nevertheless, an exact test can be developed by the familiar method in which the population 
is generated by randomization. The observed total number $u_i$ of successes (1's) in the ith
row is regarded as fixed. If the null hypothesis is true, every one of the c columns is considered
equally likely to obtain one of these successes. The population of possible results in
the ith row consists of the $\binom{c}{u_i}$ ways in which the $u_i$ successes can be distributed among the
c columns.

This specification has one consequence that might be questioned. If a row contains no 
successes, or c successes, the population generated in that row consists only of the single 
case that actually occurred. As will be seen, this implies that such rows play no part in the 
test of significance. This is evident in the two-sample test, which makes no use of the number 
of cases a and d in the cells of Table 1 where there was no change of opinion. On the other 
hand, for given values of b and c, one might feel intuitively that significance ought to be more
definitely established if there are no cases in which the samples give the same result (i.e. a 
and d are zero) than if there are a large number of such cases. Whether this feeling is sound is 
perhaps debatable, and I do not see how weight can be given to it without losing the advantage 
of an exact test. 

The test criterion that will be used is £(7;—T7')*, where 7; is the total number of successes 
in the jth column. This is the same criterion as in the ordinary x? test for the situation where 
the columns are independent. It may not be the best criterion. For the usefulness of the 
data from a row for the purpose of detecting differences among columns may depend on the 
probability of success in the row. That is, the situation may be similar to that which occurs 
in dosage-mortality experiments, in which, for maximum sensitivity per observation, 
comparisons of two drugs must be made close to the median lethal doses. This suggests that 
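in extensive data it might be advisable to group the rows according to the value of $u_i$ and
to perform some kind of weighting on the $T_j$ values for different values of $u_i$. I have occasionally
used this approach, but it may be difficult to decide what form the weighting should take,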
particularly in a new type of experimentation. A test based on the unweighted totals will 
often serve our purpose. 

3. THE LIMITING DISTRIBUTION 


We consider first the limiting distribution of the test criterion when the number of rows r is
very large. Let the variate $x_{ij}$ take the value 1 if there is a success in the cell in the ith row
and jth column, and 0 if there is a failure. By the properties of the randomization in that
row, these two events occur with probabilities $u_i/c$ and $1 - u_i/c$, respectively. Hence

$$E(x_{ij}) = \frac{u_i}{c}, \qquad \sigma^2(x_{ij}) = \frac{u_i}{c}\left(1 - \frac{u_i}{c}\right).$$
















W. G. CocHRAN 259 


By symmetry, the covariance is the same for any two cells in the same row. Since the row
total of the $x_{ij}$ is fixed at $u_i$ and thus has zero variance, the covariance of $x_{ij}$ and $x_{ik}$ is found
to be

$$\operatorname{cov}(x_{ij}, x_{ik}) = -\frac{\dfrac{u_i}{c}\left(1 - \dfrac{u_i}{c}\right)}{(c-1)} \qquad (j \neq k).$$

These results enable us to arrive, by non-rigorous methods, at the form of the limiting
distribution. Since the randomization is independent in different rows, the means, variances
and covariances of the column totals $T_j$ will be the corresponding expressions above, summed
over the rows. If the number of rows is large, the joint distribution of these totals may be
expected to tend to the multivariate normal. Finally, if a set of c variates $T_j$ follow a multivariate
normal distribution with common variance $\sigma^2$ and common covariance $\rho\sigma^2$, it is
well known that $\sum (T_j - \bar{T})^2$ is distributed as $\chi^2\sigma^2(1-\rho)$, with (c-1) degrees of freedom
(Walsh, 1947). In this case

$$\sigma^2 = \sum_i \frac{u_i}{c}\left(1 - \frac{u_i}{c}\right), \qquad \rho = -\frac{1}{c-1},$$

so that

$$\sigma^2(1-\rho) = \frac{1}{c-1}\sum_i u_i\left(1 - \frac{u_i}{c}\right).$$


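Hence, when the number of rows is large,

$$Q = \frac{(c-1)\sum_j (T_j - \bar{T})^2}{\sum_i u_i\left(1 - \dfrac{u_i}{c}\right)} = \frac{c(c-1)\sum_j (T_j - \bar{T})^2}{c\left(\sum_i u_i\right) - \left(\sum_i u_i^2\right)} \tag{1}$$

is distributed as $\chi^2$ with (c-1) D.F.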

A rigorous proof of this result may be obtained by the method developed by Hsu (1949)
and will not be given here. The only restriction needed is a rather obvious one, to guard
against the possibility that as the number of rows tends to infinity, the value of $u_i$ might be
c or 0 in all but a finite number of rows. If this happens, the size of the population is still finite
in the limit, because permutations within rows having $u_i$ = c or 0 do not generate any new
cases. This situation is avoided by stipulating that for at least one intermediate value of $u_i$,
the number of rows having that value must tend to infinity.
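As a minimal sketch of how the criterion Q might be computed from a table of 1's and 0's, the function below forms the column totals and row totals and applies the second form of Q given above. The function name `cochran_q` and the small 5 x 3 table are illustrative assumptions, not taken from the paper; exact rational arithmetic is used only to make the check transparent.

```python
from fractions import Fraction


def cochran_q(table):
    """Q for a list of rows of 0/1 scores (one row per matched group)."""
    c = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    t_bar = Fraction(sum(col_totals), c)
    ss = sum((t - t_bar) ** 2 for t in col_totals)
    denom = c * sum(row_totals) - sum(u * u for u in row_totals)
    if denom == 0:
        raise ValueError("every row has c or 0 successes; Q is undefined")
    return c * (c - 1) * ss / denom


# Hypothetical 5 x 3 table (illustrative only, not data from the paper).
table = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]
q = cochran_q(table)
# Adding rows made up wholly of 1's or wholly of 0's leaves Q unchanged.
q_padded = cochran_q(table + [[1, 1, 1], [0, 0, 0]])
print(q, q_padded)
```

The second call illustrates the point made in the text that rows with c or 0 successes generate no new cases under the randomization and so play no part in the test.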

When there are only two samples (c = 2) the test reduces to that given in §1. If a 1
denotes a reply of ‘more than a year’ it will be seen from Table 1 that

$$T_1 = c + d, \quad T_2 = b + d, \quad \sum u_i = b + c + 2d, \quad \sum u_i^2 = b + c + 4d.$$

From (1) Q becomes

$$Q = \frac{(b-c)^2}{b+c}.$$
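The reduction for two samples can be written out step by step. The following is a sketch of the algebra only; note that in Table 1 the letter c denotes a cell frequency, so the number of columns is written here as the numeral 2 to avoid confusion.

```latex
\begin{aligned}
\sum_j (T_j - \bar{T})^2 &= \tfrac{1}{2}(T_1 - T_2)^2 = \tfrac{1}{2}(c - b)^2, \\
2\sum_i u_i - \sum_i u_i^2 &= 2(b + c + 2d) - (b + c + 4d) = b + c, \\
Q &= \frac{2(2-1)\,\tfrac{1}{2}(b - c)^2}{b + c} = \frac{(b - c)^2}{b + c}.
\end{aligned}
```

The frequencies a and d cancel from both numerator and denominator, which is the algebraic counterpart of the remark that rows with no change of opinion play no part in the test.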





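4. COMPARISON WITH THE ORDINARY $\chi^2$ TEST
In the ordinary $\chi^2$ test, valid when the samples are independent, we have

$$\chi^2 = \frac{\sum_j (T_j - \bar{T})^2}{r\,\dfrac{\bar{u}}{c}\left(1 - \dfrac{\bar{u}}{c}\right)}, \tag{2}$$

where $\bar{u} = \sum u_i / r$.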











260 Comparison of percentages in matched samples 


Under what conditions does this test coincide with the new (Q) test? It might be anti- 
cipated that this should happen when the probability of success does not change from row 
to row. The results are in line with this expectation. 

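Consider the application of both tests to a series of tables, all of which have the same set
of row totals. From (1) and (2) the new test gives a greater, an equal or a smaller number of
significant results than the ordinary test, according as

$$\sum_i u_i\left(1 - \frac{u_i}{c}\right) \lesseqgtr \frac{(c-1)}{c}\, r\bar{u}\left(1 - \frac{\bar{u}}{c}\right). \tag{3}$$

Since $\sum u_i = r\bar{u}$, the left-hand side may be expressed as

$$r\bar{u}\left(1 - \frac{\bar{u}}{c}\right) - \frac{1}{c}\sum_i (u_i - \bar{u})^2.$$

It follows that relations (3) are equivalent to

$$r\bar{u}\left(1 - \frac{\bar{u}}{c}\right) \lesseqgtr \sum_i (u_i - \bar{u})^2. \tag{4}$$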


If we wish to test the null hypothesis that the probability of success is the same in all
rows, this could be done by an ordinary $\chi^2$ test on the row totals $u_i$. Since rows are independent,
the value of $\chi^2$ would be

$$\chi^2 = \frac{\sum_i (u_i - \bar{u})^2}{\bar{u}\left(1 - \dfrac{\bar{u}}{c}\right)},$$

with (r-1) degrees of freedom. Thus relations (4) can be written

$$r \lesseqgtr \chi^2. \tag{5}$$

The expected value of $\chi^2$ is (r-1), which is indistinguishable from r if the number of rows
is large. Thus the equality in relations (5) is satisfied when the value of $\chi^2$ in a test on the row
totals is just about equal to its expectation. The analysis also shows that the Q test gives
more significant results when $\chi^2$ exceeds its expectation, and fewer significant results when
$\chi^2$ is below expectation.
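The relation between the two tests can be checked numerically. The sketch below computes Q, the ordinary $\chi^2$ of equation (2), and the $\chi^2$ for homogeneity of the row totals, using the column totals (6, 10, 7, 10) and the row-total frequencies quoted for the diphtheria example in §5. The function name `compare_tests` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def compare_tests(col_totals, u_freq):
    """u_freq maps each row total u to the number of rows having that total."""
    c = len(col_totals)
    r = sum(u_freq.values())
    sum_u = sum(u * f for u, f in u_freq.items())
    sum_u2 = sum(u * u * f for u, f in u_freq.items())
    u_bar = Fraction(sum_u, r)
    t_bar = Fraction(sum(col_totals), c)
    ss_cols = sum((t - t_bar) ** 2 for t in col_totals)
    # New test: Q, second form of equation (1).
    q = c * (c - 1) * ss_cols / (c * sum_u - sum_u2)
    # Ordinary chi-square for independent samples, equation (2).
    chi2_ordinary = ss_cols / (r * (u_bar / c) * (1 - u_bar / c))
    # Chi-square for equality of the success probability over rows.
    ss_rows = sum_u2 - r * u_bar ** 2
    chi2_rows = ss_rows / (u_bar * (1 - u_bar / c))
    return q, chi2_ordinary, chi2_rows, r


q, chi2_ordinary, chi2_rows, r = compare_tests(
    [6, 10, 7, 10], {4: 4, 3: 5, 2: 1, 0: 59}
)
print(float(q), float(chi2_ordinary), float(chi2_rows), r)
```

In this example the row $\chi^2$ is far above the number of rows, and Q is correspondingly much larger than the ordinary $\chi^2$, in line with relations (5).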


5. APPLICATION TO THE EXAMPLE 
In the example (Table 2) we have c = 4,

$$\sum (T_j - \bar{T})^2 = 6^2 + 10^2 + 7^2 + 10^2 - (33)^2/4 = 12.75.$$

To find the denominator of Q, a separate frequency distribution of the values of the row
totals $u_i$ may be made.

Value of u_i    Frequency
     4              4
     3              5
     2              1
     0             59

$$\sum u_i = \sum T_j = 33, \qquad \sum u_i^2 = 113.$$

Hence from (1)

$$Q = \frac{(4)(3)(12.75)}{(4)(33) - (113)} = 8.05,$$

with 3 degrees of freedom, corresponding to a probability of 0.045.
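The arithmetic of this example can be reproduced in a few lines. The sketch below takes the quantities quoted in the text (c = 4, column sum of squares 12.75, row-total sums 33 and 113) and evaluates the upper tail of $\chi^2$ with a helper `chi2_sf`, which is an illustrative function written here from the standard closed-form finite sums for integer degrees of freedom; it is not part of the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of chi-square with integer df (finite-sum form)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        terms = sum(h ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-h) * terms
    terms = sum(
        h ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(h)) + math.exp(-h) * terms


c = 4
ss_cols = 6**2 + 10**2 + 7**2 + 10**2 - 33**2 / 4   # 12.75
sum_u, sum_u2 = 33, 113
q = c * (c - 1) * ss_cols / (c * sum_u - sum_u2)
p = chi2_sf(q, c - 1)
print(round(q, 2), round(p, 3))
```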










It is easy to show algebraically, as may be verified in this example, that the value of Q
is not altered if we omit all rows in which $u_i$ = c or 0. In this respect, Q behaves as we should
expect a test based on the randomization process to behave.

In the example, 63 of the 69 rows have 4 or 0 successes, so that only 6 rows really con- 
tribute to the frequency distribution of Q. It may be doubted whether a limiting distribution 
which assumes a large number of such rows can be applied here. This question is discussed 
in §6. 

6. THE DISTRIBUTION OF Q IN SMALL SAMPLES
If there are only two columns, an exact small-sample test presents no difficulty. The Q test
is essentially equivalent to the sign test (Cochran, 1937; Dixon & Mood, 1946), for which
tables are available in the references cited.* This can be seen by the argument used in §1.
Apart from its divisor, Q is $(T_1 - T_2)^2$, i.e. the square of the difference between the number of
successes in the two columns. We may ignore all rows that contain either 2 or 0 successes,
since these do not affect the value of Q. Consequently $(T_1 - T_2)$ is the difference between the
number $r_1$ of rows in which the results are (1-0) and the number $r_2$ in which they are (0-1).
If $n = r_1 + r_2$, this difference equals $(2r_1 - n)$.

For any row that has one success, the probabilities of a (1-0) and of a (0-1) on the null
hypothesis are both ½. This shows that $r_1$ is distributed in the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$, which is the
quantity that is tabulated in the sign test.

For an exact test when c = 2, the procedure is therefore as follows: (i) ignore all rows with
2 or 0 successes; (ii) count the number of rows with a single success in the first column, and
refer to the tables of the sign test, where n is the total number of rows that have one success.
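Where tables of the sign test are not to hand, the binomial tail can be summed directly. The sketch below assumes a two-sided test in which every outcome at least as far from n/2 as the observed $r_1$ is counted; the function name `sign_test_p` and the numbers in the usage line are hypothetical, not data from the paper.

```python
from fractions import Fraction
from math import comb


def sign_test_p(r1, n):
    """Two-sided exact P for r1 rows of type (1-0) among n rows with one success."""
    observed = abs(2 * r1 - n)
    favourable = sum(
        comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= observed
    )
    return Fraction(favourable, 2 ** n)


# Hypothetical case: 10 rows with a single success, 9 of them in the first column.
p = sign_test_p(9, 10)
print(p, float(p))
```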

In small tables that have more than two columns, the exact distribution of Q can be
tabulated by enumerating all configurations generated by the randomization. Since the
number of possible cases is large, a comprehensive listing of exact significance levels would
be laborious to construct. As a check on the accuracy of the limiting distribution in small
samples, the exact distribution of Q was worked out for the following eight cases:

c = 3, r = 10; $2^5 1^5$,        c = 4, r = 6; $3^5 2^1$,
c = 3, r = 10; $2^1 1^9$,        c = 4, r = 9; $3^3 2^3 1^3$,
c = 3, r = 11; $2^1 1^{10}$,     c = 4, r = 10; $3^3 2^3 1^4$,
c = 3, r = 16; $2^1 1^{15}$,     c = 5, r = 8; $4^2 3^2 2^2 1^2$.

The figures following the semicolon are the $u_i$ values: e.g. $2^5 1^5$ means that $u_i$ is 2 in five
of the rows and 1 in the remaining five. No case in which $u_i$ = c or 0 was included, since any
number of such rows may be added to the basic table without affecting the value of Q.

Some of the cases are rather closely related in their structure. Nevertheless, it seemed 
best to include all of them in presenting summary comparisons. The cases were chosen as 
indicative of the smallest samples in which the $\chi^2$ approximation to the distribution of Q is
likely to be needed. Smaller samples can of course occur in practice, but in this event it is 
relatively easy to make an exact test of significance from the exact distribution of Q. 

The exact distribution was compared not only with the $\chi^2$ approximation, but also with
an F-test applied to the data by means of an analysis of variance into the components

                     D.F.
Rows                 (r-1)
Columns              (c-1)
Rows x columns       (r-1)(c-1)


* This has been pointed out by Mosteller (1947). 













where F is the ratio of the mean squares for columns and rows x columns. If the data had 
been measured variables that appeared normally distributed, instead of a collection of 1’s 
and 0’s, the F-test would be almost automatically applied as the appropriate method. 
Without having looked into the matter, I had once or twice suggested to research workers 
that the F-test might serve as an approximation even when the table consists of 1’s and 0’s. 
As a testimony to the modern teaching of statistics, this suggestion was received with 
incredulity, the objection being made that the F-test requires normality, and that a mixture 
of 1’s and 0’s could not by any stretch of the imagination be regarded as normally distributed. 
The same workers raised no objection to a $\chi^2$ test, not having realized that both tests require
to some extent an assumption of normality, and that it is not obvious whether F or $\chi^2$ is
more sensitive to the assumption. Inclusion of the F-test is also worth while in view of the 
widespread interest in the application of the analysis of variance to non-normal data. 

The total number of values in a population is sufficiently small so that correction for
continuity makes an appreciable difference. Application of the correction requires a little
inspection of the data. Usually the values of $\sum (T_j - \bar{T})^2$ increase by 2's, but with c = 2 or 3
the increase may be much greater, and it is necessary to discover what is the value of Q
immediately below the one actually obtained, and enter the table with a value midway
between the two. For $\chi^2$ the results are given both with and without correction, since
experience in other problems has suggested that the correction may not be helpful when
there are more than two samples. For F the correction was a decided improvement and only
corrected values are shown.*

It is easy to build up the exact distribution row by row. Members of the first row need not 
be permuted, but all other rows must be. Consider the diphtheria example in Table 2. If 
the sixty-three rows which show either all positives or no positives are omitted, this becomes 
the case c = 4, r = 6; $3^5 2^1$. We start with the row (1110) and add successively four rows with
$u_i$ = 3 and one row with $u_i$ = 2. Addition of the second row gives the four cases


1110 1110 1110 1110 
0111 1011 1101 1110 


1221 2121 2211 2220 


At this stage the possible sets of column totals are (2220) with probability 1/4 and (2211) 
with probability 3/4. All permutations of the third row are now added, and so on. The total 
number of cases is $(4^4)(6)$, or 1536, but these combine to give only nine different values of Q.
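The row-by-row construction just described lends itself to direct enumeration. The sketch below keeps the first row fixed at (1110), permutes the remaining five rows, and tallies the values of Q. The function names are illustrative assumptions; the observed column totals (2, 6, 3, 6) are the Table 2 totals (6, 10, 7, 10) after removing the four rows made up wholly of 1's.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product


def row_patterns(c, u):
    """All 0/1 rows of length c that contain exactly u ones."""
    return [tuple(1 if j in ones else 0 for j in range(c))
            for ones in combinations(range(c), u)]


def cochran_q(col_totals, row_totals):
    """Q of equation (1), kept exact with rational arithmetic."""
    c = len(col_totals)
    t_bar = Fraction(sum(col_totals), c)
    ss = sum((t - t_bar) ** 2 for t in col_totals)
    denom = c * sum(row_totals) - sum(u * u for u in row_totals)
    return c * (c - 1) * ss / denom


def exact_q_distribution(c, row_totals):
    """Count the equally likely cases giving each value of Q."""
    first = row_patterns(c, row_totals[0])[0]      # first row is not permuted
    others = [row_patterns(c, u) for u in row_totals[1:]]
    counts = Counter()
    for rows in product(*others):
        col_totals = [sum(col) for col in zip(first, *rows)]
        counts[cochran_q(col_totals, row_totals)] += 1
    return counts


row_totals = [3, 3, 3, 3, 3, 2]                    # the case c = 4, r = 6
dist = exact_q_distribution(4, row_totals)
n_cases = sum(dist.values())
q_obs = cochran_q([2, 6, 3, 6], row_totals)
tail = sum(n for q, n in dist.items() if q >= q_obs)
print(n_cases, len(dist), tail)
```

The tally `tail / n_cases` is the exact significance probability of the observed Q under the randomization; for tables much larger than this one the number of cases grows quickly and a full enumeration of this kind becomes laborious.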


7. RESULTS OF THE SMALL-SAMPLE TESTS 


In appraising the tabular $\chi^2$ and F approximations, attention was concentrated on the region
in which the exact probability lies between 0.2 and 0.005. Table 3 shows the average percentage
errors in the estimates of significance probabilities for each of three subdivisions of
the region. The percentage error is

100(tabular P - true P)/(true P),

where the averages are taken without regard to sign. The numbers of overestimates and
underestimates made by each approximation are also shown. In Table 3, $\chi^2$ denotes the
uncorrected value, and $\chi'^2$ and F′ denote the values after correction for continuity.


* Actually, the incomplete beta function rather than F was corrected for continuity, since the former
was more convenient for reading significance probabilities. Results differ slightly, but not materially,
from those given by correcting F itself.




From the percentage errors it appears that $\chi^2$ (uncorrected) and F′ have performed about
equally well, both being slightly better than $\chi'^2$. None of the methods is free from bias. $\chi'^2$
tends to overestimate and F′ to underestimate. Over the range as a whole $\chi^2$ comes off fairly
well with 23 overestimates and 32 underestimates, but it appears that a negative bias in
the region of 0.2 to 0.1 is being counteracted by a positive bias in the region of 0.02 to 0.005.
For practical use $\chi^2$ is preferable to F′, since it is slightly easier to calculate, though the
possible application of F′ to more complex tables should be borne in mind.


Table 3. Average percentage errors in estimating significance probabilities

                      Percentage error      No. of overestimates (+) and underestimates (-)
Range of              χ²     χ′²    F′        χ²           χ′²          F′
exact P                                      +     -      +     -      +     -
0.2 to 0.1            15     …      …        …     …      …     4      6     5
0.1 to 0.02           14     …      …        …     …      …     5      7    19
0.02 to 0.005         21     …      …        …     …      …     …      …     …
Average or total      16     …      …       23    32     45    10     14    41













At the true 5% level, average errors of about 14% are to be anticipated, which means
that the tabular approximations might give a value of 0.057 or 0.043 instead of 0.05. At the
1% level, the corresponding figures are about 0.012 and 0.008. These results appear close
enough for routine decisions. For true probabilities below 0.005 all methods tend to go to
pieces. F′ may give values only one-quarter of the true probability, while the two $\chi^2$ values
may be six or eight times too high. An exact assessment of a very small probability is rarely
essential.

It may be of interest to note the probabilities given by the various approximations for
the diphtheria example in Table 2. We have already calculated $\chi^2$, with P = 0.045. The
exact P is 81/1536, or 0.053, while $\chi'^2$ gives 0.080 and F′ gives 0.062. All methods agree to
the extent of indicating a probability somewhere close to the region of significance.


8. FURTHER NOTES ON THE SMALL-SAMPLE CASE 


It has been mentioned that the value of Q, and hence of $\chi^2$, is unaffected by any rows which
contain c or 0 successes. This is not so for F, where the degrees of freedom (r-1)(c-1) in
the denominator are obviously increased by the addition of rows of any kind. The value of
F itself is also affected. Without resort to details, what happens is that if we take a basic
table containing no rows with c or 0 successes, and add to it an increasing number of such rows,
the probabilities given by F′ (corrected) or F (uncorrected) increase slowly until ultimately
they agree with those given by $\chi'^2$ and $\chi^2$ respectively. This implies, incidentally, that at
intermediate stages F′ may give a better approximation than any of the methods previously
presented, because for the basic table the probability given by $\chi'^2$ is in general too high and
that by F′ too low. In the eight worked examples, this was so when half of the rows were













c's or 0's. In fact, it might be possible, as a purely empirical device, to set a quota of such
rows which would be included in calculating F (whether they were actually present or not),
so as to make F or F′ a good approximation to the exact probability. This approach was not
pursued since $\chi^2$ seems good enough for most purposes. The approach may appear slightly
repugnant logically, but is no more so than the use of an empirical approximation to an exact
frequency distribution.

Some investigation was undertaken in an attempt to discover why, at low values of the
exact significance probability, $\chi^2$ gives an overestimate of the probability. As might be
expected, the principal reason seems to be that in small samples the true variance of Q is
less than that ascribed to it by the $\chi^2$ approximation. The true variance of Q can be obtained
by the usual rather laborious methods. We find

$$E(Q) = (c-1); \qquad V(Q) = 2(c-1)\left[1 - \frac{S_2 - 2S_3 + S_4}{(S_1 - S_2)^2}\right],$$

where

$$S_k = \sum_i \left(\frac{u_i}{c}\right)^k.$$

The mean value of Q agrees with that of $\chi^2$, but the variance is always slightly too low.
These results provide another approximation to the exact distribution of Q, in which instead
of the $\chi^2$ distribution we use a type III distribution with exactly the same first two moments
as Q, and with, in general, non-integral degrees of freedom. This approximation was tested
on the eight examples. It gave a substantial improvement for probabilities less than 0.005,
but in the region between 0.2 and 0.005 was only slightly better than $\chi^2$. A similar elaboration
of F produced about the same results.
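The moments of Q can be checked against a complete enumeration in a small case. The sketch below does this for c = 4 with five rows having 3 successes and one row having 2. It assumes the variance takes the form $2(c-1)[1 - (S_2 - 2S_3 + S_4)/(S_1 - S_2)^2]$ with $S_k = \sum (u_i/c)^k$; this expression follows from the row variances and covariances of §3 and should be read as an assumption of the sketch. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product


def q_moments_exact(c, row_totals):
    """Mean and variance of Q over all equally likely randomizations."""
    patterns = [
        [tuple(int(j in ones) for j in range(c))
         for ones in combinations(range(c), u)]
        for u in row_totals
    ]
    denom = c * sum(row_totals) - sum(u * u for u in row_totals)
    t_bar = Fraction(sum(row_totals), c)
    values = []
    for rows in product(*patterns):
        totals = [sum(col) for col in zip(*rows)]
        ss = sum((t - t_bar) ** 2 for t in totals)
        values.append(c * (c - 1) * ss / denom)
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def q_variance_formula(c, row_totals):
    """Assumed closed form: 2(c-1)[1 - (S2 - 2 S3 + S4)/(S1 - S2)^2]."""
    s1, s2, s3, s4 = (
        sum(Fraction(u, c) ** k for u in row_totals) for k in range(1, 5)
    )
    return 2 * (c - 1) * (1 - (s2 - 2 * s3 + s4) / (s1 - s2) ** 2)


row_totals = [3, 3, 3, 3, 3, 2]
mean, var = q_moments_exact(4, row_totals)
var_formula = q_variance_formula(4, row_totals)
print(mean, var, var_formula, float(var))
```

For this case the variance falls short of the value 2(c-1) = 6 ascribed to it by the $\chi^2$ approximation, which is the direction of the discrepancy described in the text.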

As mentioned previously, the eight examples which were worked lead to a recommendation
not to use the correction for continuity with $\chi^2$. This conclusion applies only when there are
more than two samples. With two samples, the argument for a continuity correction is
already provided by Yates's examination of the correction when used with the binomial
distribution. As a check, two exact distributions with only two columns were worked out,
and both showed $\chi'^2$ superior to $\chi^2$, though $\chi'^2$ still tended to overestimate the probabilities.
A subdivision of the eight worked examples into the four examples with c = 3 and the four
examples with c = 4 or 5 indicated that the superiority of $\chi^2$ over $\chi'^2$ was slightly greater
in the latter group. With c = 3, the average percentage errors were 23 for $\chi'^2$ and 18 for $\chi^2$,
whereas with c = 4 or 5 the corresponding figures were 27 and 15 respectively.


9. SUBDIVISION OF $\chi^2$ INTO COMPONENTS


In the limiting distribution all totals $T_j$ have the same expectations, variances, and covariances
when the null hypothesis holds. This implies that if we divide $\sum (T_j - \bar{T})^2$ into
components by the usual rules for subdividing a sum of squares, each component, when
multiplied by the factor which converts it to Q, will follow a $\chi^2$ distribution in large samples.
This procedure requires some care in its application. The diphtheria example is not very
suitable for illustration, since the total $\chi^2$ is barely significant and would probably not be
subdivided into components. The artificial example in Table 4, with data similar to those in
the diphtheria example but showing more significance, will be used.
In the frequency distribution of $u_i$ the rows with $u_i$ = 4 or 0 have been omitted. Since

$$\sum (T_j - \bar{T})^2 = (6)^2 + (15)^2 + (12)^2 + (17)^2 - (50)^2/4 = 69,$$

we find

$$Q = \frac{(4)(3)(69)}{44} = \frac{3(69)}{11} = 18.81, \text{ with 3 D.F.}$$



















Suppose that there is some reason to expect that A may perform differently from B, C
and D. We might then wish to divide Q into the components A v. B, C, D and B v. C v. D. For
the first component we calculate

$$\frac{[3(6) - 44]^2}{12} = 56.33, \qquad Q_1 = \frac{3(56.33)}{11} = 15.36 \ (1\ \text{D.F.}).$$

By subtraction from the total Q, 18.81, we find $Q_2$ to be 3.45 (2 D.F.). It represents a comparison
of the totals of B, C and D.


Table 4. Artificial example to illustrate subdivision of $\chi^2$

 A  B  C  D    No. of cases        Frequency distribution of u_i
 1  1  1  1         4                   u_i     f
 …                                       3      8
 0  1  1  1         6                    2      5
 …
 0  0  0  0        59              Σu_i = 34, Σu_i² = 92.
                                   cΣu_i - Σu_i² = 4(34) - 92 = 44.

























The difficulty is that since $Q_1$ is definitely significant, the null hypothesis that the probability
of success within a row is the same in all four columns can no longer be maintained
for the development of a comparative test of B, C and D amongst themselves. It seems
better, when $Q_1$ is significant, to recalculate $Q_2$, using only the data from the relevant columns.
If we reject the first column, the B, C and D totals do not change, but the frequency distribution
of $u_i$ (ignoring 3's and 0's) becomes

 u_i    f
  2     7

$$\sum u_i = 14, \qquad \sum u_i^2 = 28, \qquad c\sum u_i - \sum u_i^2 = 14.$$

Hence

$$Q_2' = \frac{(3)(2)(12.67)}{14} = 5.43 \ (2\ \text{D.F.}).$$

The difference between $Q_2$ and $Q_2'$ arises solely in the conversion factor from $\sum (T_j - \bar{T})^2$ to Q,
which has been altered from 3/11 to 3/7. This changes the significance probability from
0.178 for $Q_2$ to 0.066 for $Q_2'$. The exact probability, computed from the data for B, C and D
alone, is found to be 0.078.

From such cases as I have examined, the ordinary rule for the subdivision of the sum of
squares, and hence of Q, appears good enough for a preliminary inspection. When the
situation is similar to that in this example, the advisability of recomputing tests that are of
special interest should be noted.
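The subdivision and the recomputed component can be followed numerically. The sketch below uses the column totals and the denominators quoted in the text (44 for the full table, 14 for columns B, C and D alone); variable names are illustrative, and the tail probability for 2 degrees of freedom is taken as exp(-x/2), which is exact for that case.

```python
import math


def sum_sq_dev(totals):
    """Sum of squared deviations of the totals from their mean."""
    mean = sum(totals) / len(totals)
    return sum((t - mean) ** 2 for t in totals)


totals = {"A": 6, "B": 15, "C": 12, "D": 17}
c = 4
denom_full = 44                     # c*sum(u) - sum(u^2) = 4(34) - 92
factor = c * (c - 1) / denom_full   # conversion factor 3/11

q_total = factor * sum_sq_dev(list(totals.values()))
contrast = 3 * totals["A"] - (totals["B"] + totals["C"] + totals["D"])
q1 = factor * contrast ** 2 / 12    # A v. B, C, D (1 D.F.)
q2 = q_total - q1                   # B v. C v. D by subtraction (2 D.F.)

bcd = [totals[k] for k in "BCD"]
q2_prime = 3 * 2 * sum_sq_dev(bcd) / 14   # recomputed from B, C, D alone

p_q2 = math.exp(-q2 / 2)
p_q2_prime = math.exp(-q2_prime / 2)
print(round(q_total, 2), round(q1, 2), round(q2, 2), round(q2_prime, 2))
print(round(p_q2, 3), round(p_q2_prime, 3))
```

The two values of the second component differ only through the conversion factor, 3/11 against 3/7, as stated in the text.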


10. SUMMARY


In this paper the familiar $\chi^2$ test for comparing the percentages of successes in a number of
independent samples is extended to the situation in which each member of any sample is
matched in some way with a member of every other sample. This problem has been encountered
in the fields of psychology, pharmacology, bacteriology and sample survey design.













A solution has been given by McNemar (1949) and others when there are only two 
samples. 

In the more general case, the data are arranged in a two-way table with r rows and
c columns, in which each column represents a sample and each row a matched group. The
test criterion proposed is

$$Q = \frac{c(c-1)\sum_j (T_j - \bar{T})^2}{c\left(\sum_i u_i\right) - \left(\sum_i u_i^2\right)},$$

where $T_j$ is the total number of successes in the jth sample (column) and $u_i$ the total number of
successes in the ith row. If the true probability of success is the same in all samples, the
limiting distribution of Q, when the number of rows is large, is the $\chi^2$ distribution with (c-1)
degrees of freedom. The relation between this test and the ordinary $\chi^2$ test, valid when
samples are independent, is discussed.

In small samples the exact distribution of Q can be constructed by regarding the row totals
as fixed, and by assuming that on the null hypothesis every column is equally likely to obtain
one of the successes in a row. This exact distribution is worked out for eight examples in
order to test the accuracy of the $\chi^2$ approximation to the distribution of Q in small samples.
The number of samples ranged from c = 3 to c = 5. The average error in the estimation of
a significance probability was about 14% in the neighbourhood of the 5% level and about
21% in the neighbourhood of the 1% level. Correction for continuity did not improve the
accuracy of the approximation although it is recommended when there are only two samples.
Another approximation, obtained by scoring each success as ‘1’ and each failure as ‘0’ and
performing an analysis of variance of the data, was also investigated. The F-test, corrected
for continuity, performed about as well as the $\chi^2$ approximation (uncorrected), but is slightly
more laborious.

The problem of subdividing $\chi^2$ into components for more detailed tests is briefly discussed.





In conclusion, my thanks are due to Miss Elizabeth O. Grant and Mrs Elizabeth S. Jamison 
for considerable assistance in the computations. This work was done as part of a contract 
with the Office of Naval Research, U.S. Navy Department. 


REFERENCES 


Cochran, W. G. (1937). The efficiencies of the binomial series tests of significance of a mean and of
a correlation coefficient. J. R. Statist. Soc. 100, 69.

Denton, J. E. & Beecher, H. K. (1949). New analgesics. J. Amer. Med. Ass. 141, 1051.

Dixon, W. J. & Mood, A. M. (1946). The statistical sign test. J. Amer. Statist. Ass. 41, 557.

Hsu, P. L. (1949). The limiting distribution of functions of sample means and application to testing
hypotheses. Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability,
p. 359. University of California Press.

McNemar, Q. (1949). Psychological statistics. New York: John Wiley and Sons.

Mosteller, F. J. (1947). Equality of margins. Amer. Statist. 1, 12.

Walsh, J. E. (1947). Concerning the effect of intraclass correlation on certain significance tests. Ann.
Math. Statist. 18, 88.
















THE EXACT PARTITION OF $\chi^2$ AND ITS APPLICATION TO THE
PROBLEM OF THE POOLING OF SMALL EXPECTATIONS


By H. O. LANCASTER, School of Public Health and Tropical Medicine, Sydney


1. INTRODUCTION 
When observed and expected frequencies (n in number) are given for a univariate distribution,
and the expected frequencies are known exactly, the sampling distribution of the
frequencies is the multinomial distribution. It was shown by Irwin (1949) and Lancaster
(1949) that $\chi^2$ for the comparison of observed and expected frequencies can in this case be
split up into (n-1) components (each distributed as $\chi^2$ with one degree of freedom) corresponding
respectively to the difference between the first and second frequencies, between
the first and second pooled and the third, and so on seriatim. It is therefore obvious that
pooling k frequencies with the remainder removes exactly k components of $\chi^2$, and that the
pooled distribution may be tested by $\chi^2$ with (n-k-1) degrees of freedom. If the expected
frequencies are estimated from the data, the reduction of the number of degrees of freedom
by k is not an exact allowance for pooling. This is clear from the case of the binomial with
index 2, where after a single pooling there are no degrees of freedom left, although $\chi^2$ has
not been reduced to zero; but the error is clearly small if the observed frequencies are small.

The author's earlier work and that of Irwin (1949) showed that a contingency table can
be split up into component fourfold tables in various ways in such a manner that the $\chi^2$'s
of the individual tables (with one degree of freedom each), if properly calculated, have a
sum equal to the $\chi^2$ of the whole table. A possible way of dealing with one or more consecutive
small frequencies at the ends of the rows or columns of a contingency table is to
pool them with a next neighbouring frequency. If only one frequency is pooled it might be
combined with its neighbour either in the same row or the same column. The corresponding
expectations are also combined and $\chi^2$ is then calculated in the usual manner, the number of
degrees of freedom being reduced by the number of poolings.

Consideration of a fourfold table shows immediately that this is not an exact process. For 
if the two frequencies in one row are pooled (the sum is necessarily equal to its expectation), 
the two frequencies in the other row—and their expectations—are left, but the number of 
degrees of freedom is now zero. 

Let us suppose that say three frequencies $a_{11}, a_{12}, a_{13}$ in a contingency table are pooled
with $a_{14}$. An exact method of making the allowance for pooling is to subdivide the table
into its component fourfolds, to pick out the three components in which comparisons between
$a_{11}, a_{12}, a_{13}$ and $a_{14}$ occur and remove the values of $\chi^2$ for these components from the total.
The subdivision of the total $\chi^2$ into components is not unique, but only one of the possible
subdivisions is relevant so that the procedure to be adopted is unique if the following conditions
are observed in pooling:

(i) The frequencies of the cells to be pooled are to be used separately only to compute the 
row and column totals of the original table—a necessary step in determining the expectations. 

(ii) In any of the component $\chi^2$'s not eliminated by pooling, the observed or expected frequencies
of the cells pooled must appear always summed and not singly or with different signs.

The process is instructive in itself, as we might have other reasons for making the sub- 
division, and it throws light on the error likely to be made by the approximate process 
suggested above. 











268 Exact partition of x? 


2. THE PARTITION OF x? IN (r x s) MULTIPLE CONTINGENCY TABLES 


Let $p_{ij}$ be the probability of an observation falling into the class in the ith row and jth column
where i = 1, 2, ..., r, j = 1, 2, ..., s and $a_{ij}$ be the observed frequency in this class.

Let
$$p_{i.} = \sum_{j=1}^{s} p_{ij}, \qquad p_{.j} = \sum_{i=1}^{r} p_{ij}, \tag{1}$$
and similarly for the $a_{ij}$. Further,
$$\sum_i p_{i.} = \sum_j p_{.j} = \sum_{i,j} p_{ij} = 1, \tag{2}$$
$$\sum_i a_{i.} = \sum_j a_{.j} = \sum_{i,j} a_{ij} = a. \tag{3}$$
Let
$$R_{ik} = \sum_{j=1}^{k} a_{ij}, \qquad C_{lj} = \sum_{i=1}^{l} a_{ij}, \qquad T_{lk} = \sum_{i=1}^{l}\sum_{j=1}^{k} a_{ij}. \tag{4}$$

It will be convenient to write the standardized variables with the same letters as the
observed frequencies but barred. Thus

$$\bar{a}_{ij} = \{a_{ij} - E(a_{ij})\}/\sqrt{[E(a_{ij})]}. \tag{5}$$

This notation will also be used for $R_{ik}$, $C_{lj}$, $T_{lk}$.
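The partitions of $\chi^2$ derived in terms of orthogonal transformations of the standardized
variables $\bar{a}_{ij}$ (Lancaster, 1949) can be written in terms of the squares of the standardized
variables. I do this for the case of the 2 x s table and extend the treatment to the r x s table.
In a 2 x 2 table
$$\chi^2 = \sum_{i,j} \bar{a}_{ij}^2. \tag{6}$$
In a 2 x 3 table, the exact partition gives
$$\chi^2 = \chi_1^2 + \chi_2^2. \tag{7}$$
But $\chi_2^2$ is the $\chi^2$ of a fourfold table:

 R_12   a_13 | a_1.
 R_22   a_23 | a_2.
 ------------+-----
 T_22   a_.3 | a

so that
$$\chi_2^2 = \bar{R}_{12}^2 + \bar{R}_{22}^2 + \bar{a}_{13}^2 + \bar{a}_{23}^2, \tag{8}$$
and then
$$\chi_1^2 = \bar{a}_{11}^2 + \bar{a}_{12}^2 - \bar{R}_{12}^2 + \bar{a}_{21}^2 + \bar{a}_{22}^2 - \bar{R}_{22}^2. \tag{9}$$

Similar reasoning applied to the 2 x 4 table gives
$$\begin{aligned}
\chi_3^2 &= \bar{R}_{13}^2 + \bar{R}_{23}^2 + \bar{a}_{14}^2 + \bar{a}_{24}^2, \\
\chi_2^2 &= \bar{R}_{12}^2 + \bar{a}_{13}^2 - \bar{R}_{13}^2 + \bar{R}_{22}^2 + \bar{a}_{23}^2 - \bar{R}_{23}^2, \\
\chi_1^2 &= \bar{a}_{11}^2 + \bar{a}_{12}^2 - \bar{R}_{12}^2 + \bar{a}_{21}^2 + \bar{a}_{22}^2 - \bar{R}_{22}^2,
\end{aligned} \tag{10}$$
and the extension to the 2 x s tables is obvious. The results obtained will be found sufficient
to give the correction for pooling in 3 x 3 tables and so in the general contingency table. We
may, for example, tabulate the four component $\chi^2$ of the 3 x 3 tables thus:
$$\begin{aligned}
\chi_1^2 &= \bar{a}_{11}^2 + \bar{a}_{12}^2 - \bar{R}_{12}^2 + \bar{a}_{21}^2 + \bar{a}_{22}^2 - \bar{R}_{22}^2 - \bar{C}_{21}^2 - \bar{C}_{22}^2 + \bar{T}_{22}^2, \\
\chi_2^2 &= \bar{a}_{13}^2 + \bar{a}_{23}^2 - \bar{C}_{23}^2 + \bar{R}_{12}^2 + \bar{R}_{22}^2 - \bar{T}_{22}^2, \\
\chi_3^2 &= \bar{a}_{31}^2 + \bar{a}_{32}^2 - \bar{R}_{32}^2 + \bar{C}_{21}^2 + \bar{C}_{22}^2 - \bar{T}_{22}^2, \\
\chi_4^2 &= \bar{a}_{33}^2 + \bar{C}_{23}^2 + \bar{R}_{32}^2 + \bar{T}_{22}^2.
\end{aligned} \tag{11}$$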



































H. O. LANCASTER 269 


A definite example will now be considered. A (3 x 3) table has been formed in the course of 
a random sampling experiment by a set of fifty drawings from a table of random sampling 
numbers. The theoretical probabilities are: 


Pi., Pe, Ps. = 3, é: $ 
P.wv P.2 P.s=t, é: g 


The observed and expected frequencies, using values of $p_{i.}$ and $p_{.j}$ estimated from the
data, may be tabulated as follows:








       Observed values                    Expected values
  3    4   15 |  22          3.08    7.92   11.00 |  22.00
  2    8    5 |  15          2.10    5.40    7.50 |  15.00
  2    6    5 |  13          1.82    4.68    6.50 |  13.00
  7   18   25 |  50          7.00   18.00   25.00 |  50.00























The standardized variables $\bar{a}_{ij}$ and also the Helmert matrices which give an exact partition
of $\chi^2$ are:

$$Q = (\bar{a}_{ij}) = \begin{bmatrix} -0.045584 & -1.392912 & 1.206045 \\ -0.069007 & 1.118862 & -0.912871 \\ 0.133424 & 0.610170 & -0.588348 \end{bmatrix}, \tag{12}$$

$$R = \begin{bmatrix} 0.663325 & 0.547723 & 0.509902 \\ 0.636715 & -0.771100 & 0 \\ 0.393185 & 0.324662 & -0.860233 \end{bmatrix}, \tag{13}$$

$$C = \begin{bmatrix} 0.374166 & 0.600000 & 0.707107 \\ 0.848528 & -0.529150 & 0 \\ 0.374166 & 0.600000 & -0.707107 \end{bmatrix}, \tag{14}$$

$$RQC' = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0.946346 & -2.081471 \\ 0 & 0.243722 & -0.967239 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \chi_1 & \chi_2 \\ 0 & \chi_3 & \chi_4 \end{bmatrix}. \tag{15}$$

$\chi^2$ for the contingency table is 6.22304, which is the sum of the squares of the four components of (15).
From the general form of RQC′ it may be seen that the comparison of $a_{31}$ with $a_{32}$ occurs in
only one of the $\chi$'s, namely, $\chi_3$. In the computation of every other $\chi$, $a_{31}$ and $a_{32}$, when they
appear, are added.
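The matrix product can be checked with a short computation. The sketch below rebuilds the standardized variables from the observed table, forms Helmert-type matrices from the estimated marginal proportions, and multiplies them out. The general construction in `helmert` (first row the square roots of the proportions, each later row contrasting one class with the pooled earlier classes) is an assumption of this sketch; it was chosen because it reproduces the numerical matrices printed above.

```python
import math

observed = [[3, 4, 15], [2, 8, 5], [2, 6, 5]]


def helmert(p):
    """Orthogonal matrix with first row sqrt(p); row k contrasts class k with classes before it."""
    n = len(p)
    rows = [[math.sqrt(x) for x in p]]
    cum = 0.0
    for k in range(1, n):
        cum += p[k - 1]                  # total proportion of the earlier classes
        nxt = cum + p[k]
        row = [math.sqrt(p[j] * p[k] / (cum * nxt)) for j in range(k)]
        row.append(-math.sqrt(cum / nxt))
        row += [0.0] * (n - k - 1)
        rows.append(row)
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


n = sum(map(sum, observed))
row_tot = [sum(r) for r in observed]
col_tot = [sum(col) for col in zip(*observed)]
expected = [[ri * cj / n for cj in col_tot] for ri in row_tot]
std = [[(o - e) / math.sqrt(e) for o, e in zip(orow, erow)]
       for orow, erow in zip(observed, expected)]

R = helmert([ri / n for ri in row_tot])
C = helmert([cj / n for cj in col_tot])
M = matmul(matmul(R, std), transpose(C))

chi2_total = sum(v * v for row in std for v in row)
components = [M[1][1], M[1][2], M[2][1], M[2][2]]   # chi_1, chi_2, chi_3, chi_4
print(round(chi2_total, 5), [round(x, 6) for x in components])
```

The first row and first column of the product vanish because the marginal totals agree with their expectations, so that the whole of $\chi^2$ is carried by the four remaining elements.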


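$\chi_3^2$ is the $\chi^2$ of a fourfold table

 C_21   C_22 | T_22
 a_31   a_32 | R_32
 ------------+-----
 a_.1   a_.2 | T_32

with the expected frequencies calculated from the margins of the (3 x 3) table, by means of
formula (12) of Irwin's paper (Irwin, 1949). But from (11)
$$\chi_3^2 = \bar{C}_{21}^2 + \bar{C}_{22}^2 - \bar{T}_{22}^2 + \bar{a}_{31}^2 + \bar{a}_{32}^2 - \bar{R}_{32}^2. \tag{16}$$
Pooling $a_{31}$ and $a_{32}$ removes a quantity $(\bar{a}_{31}^2 + \bar{a}_{32}^2 - \bar{R}_{32}^2)$, which is less than the amount removed
by the single component $\chi_3^2$, by $\bar{C}_{21}^2 + \bar{C}_{22}^2 - \bar{T}_{22}^2$. It is easy to show that
$$(\bar{C}_{21}^2 + \bar{C}_{22}^2 - \bar{T}_{22}^2)/\chi_3^2 = p_{3.},$$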











270 Exact partition of x” 


so that the error in assuming @3, + @3, — R32, corresponds with one degree of freedom is small 
if $p_{3.}$ is small. In the example the pooling of $a_{31}$ and $a_{32}$ removes an amount

\[ (0.18)^2/1.82 + (1.32)^2/4.68 - (1.50)^2/6.50 = 0.043956, \]
which is less than $\chi_3^2 = 0.059400$. For
\[ \chi_3^2 = (0.18)^2/1.82 + (1.32)^2/4.68 - (1.50)^2/6.50 + (0.18)^2/5.18 + (1.32)^2/13.32 - (1.50)^2/18.50. \]


The sum of the three last terms is relatively small. In a similar way the effect of pooling 
several frequencies in a contingency table may be compared with the effect of removing all 
the $\chi^2$ components, corresponding to a given method of subdivision, in which comparisons
are made between them. 
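The arithmetic of the pooling example can be reproduced directly. This is a small sketch using only the observed and expected frequencies quoted in the example; the variable names are illustrative.

```python
# Row 3, columns 1 and 2: observed minus expected, and the expected values.
d31, d32 = 2 - 1.82, 6 - 4.68
e31, e32 = 1.82, 4.68

# Amount removed from chi-square by pooling the two cells.
pooled_removed = d31**2 / e31 + d32**2 / e32 - (d31 + d32)**2 / (e31 + e32)

# Remaining three terms of chi_3^2 (rows 1 and 2 pooled: 5.18, 13.32, 18.50).
rest = d31**2 / 5.18 + d32**2 / 13.32 - (d31 + d32)**2 / 18.50

chi3_sq = pooled_removed + rest
share = rest / chi3_sq   # should equal the row-3 marginal proportion 13/50
```

Here `share` comes out as the marginal proportion of the third row, which is why the error of treating the pooled amount as a single degree of freedom is small when that proportion is small.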

The work of Sukhatme (1938) on the distribution of $\chi^2$ in samples drawn from a Poisson
population may well prove to be applicable to the case of the (r × s) contingency table with
many small expectations. His work shows that for a sample of n = 10 or more from a Poisson
distribution (or, what is the same thing, for a sample from a multinomial distribution with
ten cells and equal expectations in each cell) the distribution of $\chi^2$ agrees well with that
based on normal theory provided the expectation for each cell is two or more. In an (r × s)
contingency table the cell frequencies may be regarded as being drawn from a series of
independent Poisson populations with means $a_{i.}a_{.j}/a_{..}$ subject to the restriction that the
marginal totals are constant. Hence one would expect that for $a_{i.}a_{.j}/a_{..} > 2$ and more than say
10 degrees of freedom, the distribution of $\chi^2$ for the whole table would agree well with that
based on normal theory. However, a random sampling experiment to test this is desirable.


I should like to thank Dr J. O. Irwin for advice and assistance in drafting the paper. 


REFERENCES 


Irwin, J. O. (1949). A note on the subdivision of $\chi^2$ into components. Biometrika, 36, 130.

Lancaster, H. O. (1949). The derivation and partition of $\chi^2$ in certain discrete distributions. Biometrika, 36, 117.

Sukhatme, P. V. (1938). On the distribution of $\chi^2$ in samples from a Poisson series. J. R. Statist. Soc. Suppl. 5, 75.




THE USE OF RANGE IN ANALYSIS OF VARIANCE 


By H. O. HARTLEY


1. INTRODUCTION 


Whilst the range is widely used in industrial quality control, its application to the analysis 
of experimental data has been more restricted and the statistically more efficient, but com- 
putationally more cumbersome sample variance is usually preferred. The reason for this 
is partly that the well-known analysis of variance was first developed in connexion with 
experiments in which the computational labour of analysis often represented only a small 
fraction of the labour of experimentation, so that no effort was spared to obtain the maximum 
amount of information in the analysis of the valuable experimental results. Now that the 
analysis of variance is more widely used in operational and industrial research, situations
frequently arise in which the ‘data are cheap’ and the time available for analysis often 
limited. It is fitting therefore to inquire whether a short-cut method of analysis based on 
range could not be developed. 

The use of range or mean range as an estimator of the standard deviation of a normal 
parent has for long been common practice. Recently (Lord, 1950; Patnaik, 1950; Pearson, 
1950), results were obtained on the exact distribution of mean range, on its approximation by 
the $\chi$-distribution, on the efficiency and power of test criteria in which mean range replaces
the sample variance and, finally, on the effect (on such criteria) of deviations from normality
and of other heterogeneities in parent or sample. Cox (1949) has developed a sequential test
for $\sigma$ based on range.

However, when applying range to the analysis of experimental designs, new problems in 
both distribution and computational procedure arise, some of which we are going to examine 
in this paper. It will be seen that at the expense of a small loss of statistical efficiency, the 
labour of analysis can be appreciably reduced and brought within easy reach of experi- 
menters not equipped with calculating aids beyond a slide rule. Finally, the present pro- 
cedure provides for a current control of the homogeneity of variance without adding to the 
labour of the main analysis. 


2. NUMERICAL ILLUSTRATION OF THE RANDOMIZED BLOCK ANALYSIS BY RANGE 


Whilst the analysis of data grouped by a single classification (between-group, within-group 
analysis) follows on the lines of the t-test based on range (Patnaik, 1950), the double classi- 
fication already shows the characteristic features of the higher designs, and we shall discuss 
its analysis in detail. Before proceeding with the theoretical basis (§3) we will describe the 
procedure in terms of an example given by Snedecor (1937, p. 269) on the yields of four strains 
of wheat planted in five randomized blocks; this is set out in Table 1 with the analysis of 
variance given in Table 2.

With the range method we proceed as follows. In order to estimate the error standard
deviation $\sigma$, we first form differences of individual yields from their respective strain means
(strain residuals) and then form the ranges of these residuals for each of the five blocks. This 
work is shown in Table 3. 













The estimate of the error standard deviation is now obtained immediately from the mean
range by the simple equation

\[ s_w = 14.3/(5 \times 1.88) = 1.52. \tag{1} \]

Here the divisor, 1.88 (called the scale factor c), is obtained by entering Table 5 (§ 3.3) with
k = no. of blocks = 5 and n = no. of treatments = 4. The resulting $s_w = 1.52$ is close to the
estimated s = 1.49 from the analysis of variance (Table 2).

Next to c in Table 5 are shown the 'equivalent degrees of freedom', $\nu = 10.9$, on which our
estimate $s_w$ is based. This may be compared with the error degrees of freedom, 12, in Table 2,
indicating a loss of 1.1 degrees of freedom with our approximate method. In order to test
the significance of treatment differences, we again replace the computation of the treatment


Table 1. Yields of four strains of wheat in pounds per plot

                       Strain
  Block       A       B       C       D
    1       32.3    33.3    30.8    29.3
    2       34.0    33.0    34.3    26.0
    3       34.3    36.3    35.3    29.8
    4       35.0    36.8    32.3    28.0
    5       36.5    34.5    35.8    28.8
  Mean      34.4    34.8    33.7    28.4























Table 2. Analysis of variance; summary of data in Table 1

               Sum of squares    D.F.    Mean square
  Blocks            21.46          4         5.36
  Strains          134.45          3        44.82
  Error             26.26         12         2.19
  Total            182.17         19

  Error s = 1.49.


Table 3. Strain residuals and their block ranges

  Block       A       B       C       D     Range
    1       −2.1    −1.5    −2.9     0.9     3.8
    2       −0.4    −1.8     0.6    −2.4     3.0
    3       −0.1     1.5     1.6     1.4     1.7
    4        0.6     2.0    −1.4    −0.4     3.4
    5        2.1    −0.3     2.1     0.4     2.4
  Total                                     14.3






















mean square by the range of the treatment means which (from Table 1) is 34.8 − 28.4 = 6.4;
then the ratio $q = (6.4 \times \sqrt{5})/1.52 = 9.4$ can be immediately referred to Table 6 (§ 3.4)* for
a test of significance. Entering this table with n = no. of treatments = 4 and $\nu = 10.9$, we
obtain for the 1% point of q the value 5.5, which is clearly exceeded by the observed value
of 9.4, indicating significant treatment differences as is found with the analysis of variance
test (see Snedecor, 1937, p. 269).

If a test for block differences is desired, we would have to form the means of the five blocks
which (from Table 1) are found to be 31.4, 31.8, 33.9, 33.0 and 33.9 respectively. The range
of these is 33.9 − 31.4 = 2.5, so that $q = (2.5 \times \sqrt{4})/1.52 = 3.3$. Entering Table 6 (§ 3.4) with
n = 5 and $\nu = 10.9$, we find that the 5% point is 4.6, so that there are no significant block
differences.
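The whole of the numerical illustration can be retraced in a few lines. The sketch below takes the yields of Table 1 and the scale factor c = 1.88 from Table 5 as given; it works with unrounded means, so its results may differ from the printed ones in the last digit.

```python
import math

# Table 1: one row per block, yields of strains A, B, C, D.
yields = [
    [32.3, 33.3, 30.8, 29.3],
    [34.0, 33.0, 34.3, 26.0],
    [34.3, 36.3, 35.3, 29.8],
    [35.0, 36.8, 32.3, 28.0],
    [36.5, 34.5, 35.8, 28.8],
]
k, n = len(yields), len(yields[0])   # 5 blocks, 4 treatments
c = 1.88                             # scale factor from Table 5 (k = 5, n = 4)

strain_means = [sum(row[t] for row in yields) / k for t in range(n)]
block_means = [sum(row) / n for row in yields]

# Range of the strain residuals within each block (Table 3).
block_ranges = []
for row in yields:
    residuals = [y - m for y, m in zip(row, strain_means)]
    block_ranges.append(max(residuals) - min(residuals))

# Equation (1): error standard deviation from the mean range.
s_w = sum(block_ranges) / (k * c)

# 'Studentized range' ratios for treatments and for blocks.
q_strains = (max(strain_means) - min(strain_means)) * math.sqrt(k) / s_w
q_blocks = (max(block_means) - min(block_means)) * math.sqrt(n) / s_w
```

The two ratios are then compared with the tabulated percentage points for (n = 4, ν = 10.9) and (n = 5, ν = 10.9) respectively, exactly as in the text.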


3. THE DISTRIBUTION THEORY FOR THE RANDOMIZED BLOCK ANALYSIS BY RANGE 


3.1. Mean and variance of mean range


We denote by $x_{ti}$ the above yields from n = 4 strains (t = 1, 2, 3, 4) grown in k = 5 randomized
blocks (i = 1, 2, 3, 4, 5) and adopt the usual probability model

\[ x_{ti} = a_t + b_i + z_{ti}, \tag{2} \]

where the 'error variates $z_{ti}$' are a random sample from a normal population with mean 0
and variance $\sigma^2$, the $a_t$ are the treatment population means and the $b_i$ are random block
variates.
We now form the residuals $x_{ti} - \bar{x}_{t.}$ of Table 3, and since

\[ x_{ti} - \bar{x}_{t.} = b_i - \bar{b} + z_{ti} - \bar{z}_{t.}, \tag{3} \]

we note that for a fixed block (say the ith block) the range $w_i$ of the n values of $x_{ti} - \bar{x}_{t.}$ is
equal to the range of the n independent normal deviates $z_{ti} - \bar{z}_{t.}$ with zero mean and standard
deviation $\sigma(1 - 1/k)^{1/2}$. We therefore have

\[ \mathscr{E}(w_i) = \mathscr{E}(\bar{w}) = \sqrt{(1 - 1/k)}\, d_n \sigma, \tag{4} \]

where $d_n$ is the familiar population mean of the distribution of range in samples of n from a
normal parent with zero mean and unit S.D., which is tabulated (see, for example, Pearson, K.,
1931, Table XXII) and $\bar{w} = k^{-1} \sum w_i$. However, the block ranges $w_i$ of the residuals $x_{ti} - \bar{x}_{t.}$
are not independent, so that the distribution results derived for the mean $\bar{w}$ of independent
sample ranges (Lord, 1950; Patnaik, 1950) do not apply. On the other hand, the useful $\chi$-approximation
which Patnaik employed‡ only requires a knowledge of the mean and variance
of $\bar{w}$, and if we set out to obtain a similar approximation we may use the mean given by (4),
whilst the variance must be obtained from

\[ \operatorname{var}(\bar{w}) = k^{-1} V_n \sigma^2 (1 - 1/k) \{1 + (k - 1)\rho_w\}, \tag{5} \]

where $V_n$ is the variance of range in samples of n from a normal population with unit S.D.,
and $\rho_w$ is the correlation between any two of the block ranges. With the $\sqrt{V_n}$ known and
tabulated (Pearson, E. S., 1932), our task consists in deriving the $\rho_w$.


* Abridged form of the table of percentage points of the 'Studentized range' q (Pearson & Hartley,
1943).


† The distribution of the criteria which we consider does not depend on whether the $b_i$ are regarded
as population parameters or sample values.
‡ It is hoped to examine the accuracy of this approximation in more detail in the near future.


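Denote the residuals for two particular blocks $i_1$ and $i_2$ (say) by

\[ \xi_t = z_{t i_1} - \bar{z}_{t.}, \qquad \eta_t = z_{t i_2} - \bar{z}_{t.}, \tag{6} \]
and note that for any pair $\xi_t$, $\eta_t$ belonging to the same treatment t
\[ \operatorname{corr}(\xi_t, \eta_t) = -1/(k - 1), \qquad \operatorname{var} \xi_t = \operatorname{var} \eta_t = (k - 1)/k, \tag{7} \]

whilst any $\xi_t$, $\eta_{t'}$ belonging to different t are independent. The joint distribution of a pair of
normal deviates is therefore given by

\[ h(\xi, \eta) = (2\pi)^{-1} \left(\frac{k}{k - 2}\right)^{1/2} \exp\left\{ -\frac{1}{2} \left(\frac{k - 1}{k - 2}\right) \left(\xi^2 + \eta^2 + \frac{2\xi\eta}{k - 1}\right) \right\}, \tag{8} \]
and the joint distribution of all $\xi_t$, $\eta_t$ by the product of the h functions over all t.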


3.2. The correlation between ranges in pairwise correlated samples


We proceed to find the joint probability $P(w, \omega)$ for the range of the n values of $\xi_t$ to be
$< w$ and, simultaneously, for the range of the n values of $\eta_t$ to be $< \omega$. We find

\[ P(w, \omega) = A(w, \omega) + B(w, \omega), \tag{9} \]
where
\[ A(w, \omega) = n \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\xi\, d\eta\, h(\xi, \eta) \left( \int_{\xi}^{\xi + w} \int_{\eta}^{\eta + \omega} h(\xi', \eta')\, d\xi'\, d\eta' \right)^{n-1} \tag{10} \]
denotes the chance of configurations $\xi_t$, $\eta_t$ in which the smallest $\xi_t$ and $\eta_t$ correspond to the same
treatment t, whilst

\[ B(w, \omega) = n(n - 1) \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\xi\, d\eta \int_{\xi}^{\xi + w} d\xi' \int_{\eta}^{\eta + \omega} d\eta'\, h(\xi, \eta')\, h(\xi', \eta) \left( \int_{\xi}^{\xi + w} \int_{\eta}^{\eta + \omega} h(\xi'', \eta'')\, d\xi''\, d\eta'' \right)^{n-2} \tag{11} \]
represents the chance of configurations in which the minimum $\xi_t$ and $\eta_t$ correspond to different
treatments. Examples of configurations $\xi_t$, $\eta_t$ contributing to the integrals A and B are given
for n = 6 in Fig. 1. In both cases, A and B, the rectangle with vertices

\[ (\xi, \eta); \quad (\xi + w, \eta); \quad (\xi, \eta + \omega); \quad (\xi + w, \eta + \omega) \]

is the range of integration for the binormal surface $h(\xi'', \eta'')$. If we exclude any configurations
in which any of the $\xi_t$ and/or $\eta_t$ coincide exactly,* the configurations of types A and B are
mutually exclusive and exhaustive.

[Fig. 1. Left: configuration contributing to $A(w, \omega)$. Right: configuration contributing to $B(w, \omega)$.]


* These configurations have a zero chance.







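We now express the product-moment coefficient of w and $\omega$ ($\mu'_{11}$, say) in terms of $P(w, \omega)$
and find by double partial integration

\[ \mu'_{11} = \int_0^{\infty} \int_0^{\infty} \{1 - P(\infty, \omega) - P(w, \infty) + P(w, \omega)\}\, dw\, d\omega, \tag{12} \]
which may be written as
\[ \mu'_{11} = \lim_{W \to \infty} \left\{ -(W^2 - 2W W_n) + \int_0^W \int_0^W P(w, \omega)\, dw\, d\omega \right\}, \tag{13} \]

where in (13) $W_n = d_n (1 - 1/k)^{1/2}$.
We now evaluate the integral in (13) and write with the help of (9)

\[ \int_0^W \int_0^W P(w, \omega)\, dw\, d\omega = \int_0^W \int_0^W A(w, \omega)\, dw\, d\omega + \int_0^W \int_0^W B(w, \omega)\, dw\, d\omega = I_A + I_B, \text{ say.} \tag{14} \]
Starting with $I_B$, using (11) and writing $\theta = \xi + w$, $\tau = \eta + \omega$, we have

\[ I_B = n \int_0^W d\omega \int_0^W dw \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau \int_{\tau - \omega}^{\tau} h(\theta - w, \eta')\, d\eta' \times (n - 1) \int_{\theta - w}^{\theta} h(\xi', \tau - \omega)\, d\xi' \left( \int_{\tau - \omega}^{\tau} d\eta'' \int_{\theta - w}^{\theta} h(\xi'', \eta'')\, d\xi'' \right)^{n-2}, \tag{15} \]

and by partial integration with regard to $\omega$,

\[ I_B = n \int_0^W dw \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau \int_{\tau - W}^{\tau} h(\theta - w, \eta')\, d\eta' \left( \int_{\tau - W}^{\tau} d\eta'' \int_{\theta - w}^{\theta} h(\xi'', \eta'')\, d\xi'' \right)^{n-1} - n \int_0^W d\omega \int_0^W dw \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau\, h(\theta - w, \tau - \omega) \left( \int_{\tau - \omega}^{\tau} d\eta'' \int_{\theta - w}^{\theta} h(\xi'', \eta'')\, d\xi'' \right)^{n-1}. \tag{16} \]

The second term in (16) cancels against $I_A$, and in the first we have an integral with regard
to w, so that we obtain

\[ I_A + I_B = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau \left( \int_{\theta - W}^{\theta} \int_{\tau - W}^{\tau} h(\xi'', \eta'')\, d\xi''\, d\eta'' \right)^{n}. \tag{17} \]

Finally, converting the covariance into the correlation coefficient $\rho_w$ between w and $\omega$,
we obtain from (13)

\[ \rho_w = V_n^{-1} (1 - 1/k)^{-1} \lim_{W \to \infty} \left\{ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau \left( \int_{\theta - W}^{\theta} \int_{\tau - W}^{\tau} h(\xi, \eta)\, d\xi\, d\eta \right)^{n} - (W - W_n)^2 \right\}, \tag{18} \]

where $W_n = d_n (1 - 1/k)^{1/2}$ and $d_n$ and $V_n$ are mean and variance of the range in normal samples
of n. By changing the scales of W, $\theta$ and $\tau$ in the ratio $\{(k - 1)/k\}^{1/2}$ we may write (18) as

\[ \rho_w = V_n^{-1} \lim_{W \to \infty} \left\{ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} d\theta\, d\tau \left( \int_{\theta - W}^{\theta} \int_{\tau - W}^{\tau} z_\rho(\xi, \eta)\, d\xi\, d\eta \right)^{n} - (W - d_n)^2 \right\}, \tag{19} \]
where
\[ z_\rho(\xi, \eta) = (1 - \rho^2)^{-1/2} (2\pi)^{-1} \exp\left\{ -\frac{\xi^2 + \eta^2 - 2\rho\xi\eta}{2(1 - \rho^2)} \right\} \tag{20} \]

is the binormal surface with correlation $\rho = -1/(k - 1)$, whose integral has been tabulated
by K. Pearson (1931, Table IX). It is clear from (19) that k enters into the formula for $\rho_w$ only
through the binormal surface $z_\rho(\xi, \eta)$, which depends on $\rho = -1/(k - 1)$. In Table 4, $\rho_w(n, \rho)$
is tabulated for $\rho$ = −0.0(0.1)−0.5, −1 and n = 2(1)9. The values for $\rho$ = −0.2 were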













computed by numerical quadrature, and the remaining values were obtained by fractional
power interpolation between the values at $\rho$ = −0.2, $\rho$ = 0 (when $\rho_w$ = 0) and $\rho$ = −1
(when $\rho_w$ = 1). It is hoped, however, to improve the accuracy of Table 4 by additional
numerical quadratures. For completeness, values of $\sqrt{V_n}$ are also reproduced in Table 4.


Table 4. Correlation $\rho_w$ between ranges in pairwise correlated normal samples of size n;
standard deviation $\sqrt{V_n}$ of range in normal samples of size n

  Correlation                          Sample size, n
  between sample
  pairs ρ           2       3       4       5       6       7       8       9
    −1.0          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    −0.5          0.25    0.28    0.27    0.26    0.24    0.22    0.19    0.15
    −0.4          0.17    0.18    0.18    0.16    0.15    0.13    0.11    0.08
    −0.3          0.10    0.11    0.10    0.09    0.08    0.07    0.05    0.04
    −0.2          0.044   0.051   0.046   0.042   0.035   0.029   0.021   0.013
    −0.1          0.011   0.014   0.012   0.011   0.008   0.006   0.004   0.002
     0.0          0       0       0       0       0       0       0       0
    √V_n          0.852   0.888   0.880   0.864   0.848   0.833   0.820   0.808














3.3. The approximate $\chi$-distribution of mean range

Previous work (Patnaik, 1950) has suggested that the distribution of the mean of independent
ranges is closely approximated by that of $c\chi/\sqrt{\nu}$, with $\chi$ based on $\nu$ degrees of freedom.
Using a similar approximation for our mean $\bar{w}$ of correlated ranges and equating $\mathscr{E}(\bar{w})$ and
var($\bar{w}$) to the corresponding moments of $c\chi/\sqrt{\nu}$, we obtain 'scale factors' c and 'equivalent
degrees of freedom' $\nu$ as shown in Table 5. With the help of this table we compute, as an
estimate of $\sigma$, the ratio $\bar{w}/c$, using the equivalent degrees of freedom $\nu$ for any test of significance
for which this estimate is used. It is hoped to investigate the accuracy of the $\chi$-approximation
in future work. Table 5 corresponds to, but differs from, Patnaik's Table 1.
For $k \to \infty$ the entries in both tables converge to the same limits, i.e. $c \to d_n$, $\nu \to \infty$.
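The moment-matching step can be sketched numerically. Writing the mean and variance of $\bar{w}/\sigma$ as M and V (from (4) and (5)), equating them to the moments of $c\chi/\sqrt{\nu}$ gives $V/M^2 = \nu/\{\mathscr{E}(\chi_\nu)\}^2 - 1$, which fixes $\nu$, and then $c = M\sqrt{\nu}/\mathscr{E}(\chi_\nu)$. The code below is an illustration under assumptions, not the computation used for Table 5: the helper names are hypothetical, the root is found by simple bisection, and the value $d_4 = 2.059$ is the usual tabulated mean range for n = 4, which is not quoted in this paper.

```python
import math

def chi_mean(nu):
    """Mean of a chi variate with nu degrees of freedom."""
    return math.sqrt(2.0) * math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))

def scale_and_df(d_n, V_n, rho_w, k):
    """Match mean and variance of the mean range (sigma = 1) to c*chi/sqrt(nu)."""
    mean = math.sqrt(1 - 1 / k) * d_n                       # equation (4)
    var = V_n * (1 - 1 / k) * (1 + (k - 1) * rho_w) / k     # equation (5)
    target = var / mean**2
    lo, hi = 0.5, 1000.0
    for _ in range(200):                                    # bisection on nu
        mid = 0.5 * (lo + hi)
        if mid / chi_mean(mid)**2 - 1 > target:
            lo = mid                                        # nu too small
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    c = mean * math.sqrt(nu) / chi_mean(nu)
    return c, nu

# Example of § 2: n = 4, k = 5; sqrt(V_4) from Table 4, rho_w as quoted in § 4.
c, nu = scale_and_df(d_n=2.059, V_n=0.880**2, rho_w=0.06, k=5)
```

With these rounded inputs the result should fall close to the Table 5 entry for k = 5, n = 4; small discrepancies are to be expected because $\rho_w$ is only quoted to two decimals.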


Table 5. Scale factors c and equivalent degrees of freedom $\nu$ for
double classification into k blocks and n treatments

        n = 3        n = 4        n = 5        n = 6        n = 7        n = 8        n = 9
   k    ν     c      ν     c      ν     c      ν     c      ν     c      ν     c      ν     c
   2   1.9  1.35    3.0  1.58    3.8  1.75    4.7  1.89    5.5  2.00    6.3  2.10    7.0  2.18
   3   3.7  1.48    5.6  1.76    7.4  1.96    9.3  2.12   11.3  2.26   13.4  2.37   15.7  2.46
   4   5.4  1.54    8.2  1.84   11.0  2.06   13.9  2.23   16.9  2.38   20.1  2.50   23.6  2.60
   5   7.2  1.57   10.9  1.88   14.6  2.12   18.5  2.30   22.4  2.45   26.6  2.57   31.1  2.68
   6   8.9  1.59   13.6  1.91   18.2  2.15   23.0  2.34   27.9  2.49   33.0  2.62   38.3  2.73
   7  10.7  1.61   16.3  1.93   21.8  2.18   27.6  2.37   33.3  2.52   39.3  2.65   45.4  2.76
   8  12.5  1.62   19.0  1.95   25.4  2.20   32.1  2.39   38.7  2.55   45.6  2.68   52.5  2.79
   9  14.3  1.63   21.7  1.96   29.0  2.21   36.6  2.41   44.0  2.57   51.8  2.70   59.6  2.81
  10  16.1  1.63   24.4  1.97   32.6  2.22   41.0  2.42   49.3  2.58   57.9  2.71   66.6  2.83




3.4. The 'Studentized range' test as a substitute for the variance ratio F-test
The criterion used in § 2 for testing the significance of treatment differences was the ratio
\[ q = \sqrt{k} \times \text{range of treatment means}/(\bar{w}/c), \tag{21} \]
and a similar criterion was used for the comparison of block means. This ratio was referred
to tables of the 'Studentized range', which is the ratio of a range w in a sample of n independent
normal variates divided by an independent estimate of $\sigma$, based on $\nu$ degrees of freedom.
Even if we accept that $\bar{w}/c$ is distributed as s, we still have to show that it is independent of
the ranges of treatment means and block means.
Consider the following linear transformation of the nk independent normal deviates $z_{ti}$:

\[ \begin{aligned} u_t &= \bar{z}_{t.} - \bar{z} && (t = 2, 3, \ldots, n), \\ v_i &= \bar{z}_{.i} - \bar{z} && (i = 2, 3, \ldots, k), \\ y_{ti} &= z_{ti} - \bar{z}_{t.} - \bar{z}_{.i} + \bar{z} && (t = 2, 3, \ldots, n;\ i = 2, \ldots, k), \\ Z &= \bar{z}. \end{aligned} \tag{22} \]

We then have from the randomized block algebra:
\[ \sum_{t,i} z_{ti}^2 = k\Big[\sum_t u_t^2 + \Big(\sum_t u_t\Big)^2\Big] + n\Big[\sum_i v_i^2 + \Big(\sum_i v_i\Big)^2\Big] + \sum_{t,i} y_{ti}^2 + \sum_t \Big(\sum_i y_{ti}\Big)^2 + \sum_i \Big(\sum_t y_{ti}\Big)^2 + \Big(\sum_{t,i} y_{ti}\Big)^2 + nkZ^2. \tag{23} \]

It follows that the joint distribution of the new variates is a product of four functions depending
respectively on the u's, v's, y's and Z, and clearly the range of the treatment means,
range of block means, and ranges of residuals $z_{ti} - \bar{z}_{t.}$ can be defined in terms of the u's, v's
and y's respectively, so that they are independently distributed.
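Identity (23) is easy to confirm numerically. The following sketch draws an arbitrary n × k array of normal deviates (the seed and sizes are illustrative choices) and compares the two sides; sums over the u's, v's and y's run over t = 2, ..., n and i = 2, ..., k as in (22).

```python
import random

n, k = 4, 5
rng = random.Random(0)
z = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]  # z[t][i]

z_bar = sum(map(sum, z)) / (n * k)
z_t = [sum(z[t]) / k for t in range(n)]                       # treatment means
z_i = [sum(z[t][i] for t in range(n)) / n for i in range(k)]  # block means

# Transformation (22), omitting the first treatment and the first block.
u = [z_t[t] - z_bar for t in range(1, n)]
v = [z_i[i] - z_bar for i in range(1, k)]
y = [[z[t][i] - z_t[t] - z_i[i] + z_bar for i in range(1, k)] for t in range(1, n)]

lhs = sum(x * x for row in z for x in row)
rhs = (
    k * (sum(a * a for a in u) + sum(u) ** 2)
    + n * (sum(b * b for b in v) + sum(v) ** 2)
    + sum(x * x for row in y for x in row)
    + sum(sum(row) ** 2 for row in y)          # sum over t of (sum over i)^2
    + sum(sum(col) ** 2 for col in zip(*y))    # sum over i of (sum over t)^2
    + sum(map(sum, y)) ** 2
    + n * k * z_bar ** 2
)
```

The two sides agree to rounding error, which reflects the fact that the sum of squares separates into parts depending on the u's, the v's, the y's and Z alone.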

In referring the ratio q to tables of the 'Studentized range', no further approximation
beyond that employed for $\bar{w}$ is involved, and it follows from a general theorem (Hartley,
1948) that the maximum error in the approximation to the probability integral of q does not
exceed that for $\bar{w}$.

The small loss in efficiency in using $\bar{w}$ in place of s can be roughly gauged by comparing
the degrees of freedom $\nu$ in Table 5 with the 'error D.F.' of the corresponding full analysis of
variance, viz. (n − 1)(k − 1). A similar loss of efficiency is incurred by replacing the treatment
mean square by the range of the treatment means which is, again, approximately distributed
as $c\chi/\sqrt{\nu}$ (Patnaik, 1950) with equivalent degrees of freedom tabulated in Patnaik's Table 1
(p. 80, column m = 1).

Although these comparisons strongly suggest that the use of q in place of F results in a
small loss in power, it is hoped to confirm this by computations of power curves. In the case
of a 'random set up' for the alternative hypothesis (Johnson, 1948), this could obviously be
done with the help of the tables of the probability integral of the 'Studentized range'
(Pearson & Hartley, 1943). The tables of the percentage points which are required for the
test of the null hypothesis are reproduced below in abridged form (Table 6).


4. CURRENT CONTROL OF THE HOMOGENEITY OF VARIANCE 


The normal analysis of variance procedure of a randomized block experiment provides a
single error variance based on all residuals. Although separate residual variances $s_i^2$, $s_t^2$ could
be computed separately for individual blocks or treatments, such a procedure would be
computationally laborious and, in any case, the individual $s_i^2$ or $s_t^2$ could not be entered in
the usual test for heterogeneity ($L_1$ test or Bartlett's test) which assumes independent $s^2$.

















































































































































[Table 6. Percentage points of the 'Studentized range' q, abridged from Pearson & Hartley (1943). The tabulated values are not legible in this copy.]


With the present procedure, individual block ranges $w_i$ are available for a test, and it is
possible to use the ratio (range of $w_i$)/$\bar{w}$ = $w_w/\bar{w}$ as an expedient criterion of heterogeneity.
Although its efficiency and power still require scrutiny, this criterion is extremely simple to
compute, and we are here outlining an approach to its approximate null distribution. For
this we require the following lemma:


Lemma. Let $x_1, \ldots, x_k$ denote a multivariate observation from a multinormal distribution
with equal variances $\sigma^2$ and equal correlations $\rho > -1/(k - 1)$; then the range w of the $x_i$ is
exactly distributed as the range in a sample of k independent normal variates with variance
$\sigma^2(1 - \rho)$, and, further, is distributed independently of the mean $\bar{x}$.

Proof. The linear transforms

\[ u_i = a^{-1} (1 - \rho)^{-1/2} \Big( a x_i + \sum_{j=1}^{k} x_j \Big), \tag{24} \]
with
\[ a = -[\rho^{-1} + (k - 1)] + [(\rho^{-1} + k - 1)(\rho^{-1} - 1)]^{1/2}, \tag{25} \]

also follow a multinormal distribution and, since obviously cov($u_i$, $u_j$) = 0, are independent
normal variates for which it will be found that var(u) = $\sigma^2$. Since further

\[ w = \text{range of } x_i = (1 - \rho)^{1/2}\, \text{range of } u_i \]
and
\[ \bar{u} = (a + k)\, a^{-1} (1 - \rho)^{-1/2}\, \bar{x}, \]

the lemma follows from the fact that the range of independent normal variates $u_i$ and their
mean $\bar{u}$ are independently distributed.
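The algebra behind the lemma can be checked for a particular case. The sketch below, with hypothetical helper names, builds the matrix of the transformation (24) for given k and $\rho$, applies it to the equicorrelated covariance matrix, and returns the covariance matrix of the $u_i$, which should be $\sigma^2$ times the unit matrix.

```python
import math

def lemma_transform(k, rho):
    """Coefficient matrix T of u = T x from (24), with a taken from (25)."""
    inv = 1.0 / rho
    a = -(inv + k - 1) + math.sqrt((inv + k - 1) * (inv - 1))
    scale = 1.0 / (a * math.sqrt(1 - rho))
    T = [[scale * ((a if i == j else 0.0) + 1.0) for j in range(k)] for i in range(k)]
    return a, T

def covariance_of_u(k, rho, sigma2=1.0):
    """T S T' for the equicorrelated covariance matrix S of the x's."""
    _, T = lemma_transform(k, rho)
    S = [[sigma2 * (1.0 if i == j else rho) for j in range(k)] for i in range(k)]
    TS = [[sum(T[i][m] * S[m][j] for m in range(k)) for j in range(k)] for i in range(k)]
    return [[sum(TS[i][m] * T[j][m] for m in range(k)) for j in range(k)] for i in range(k)]

# Illustrative case: k = 5 and rho = -0.2, which satisfies rho > -1/(k - 1).
cov_u = covariance_of_u(k=5, rho=-0.2)
```

Since each $u_i$ equals $(1 - \rho)^{-1/2} x_i$ plus a term common to all i, the range of the $x_i$ is $(1 - \rho)^{1/2}$ times the range of the $u_i$, as stated in the proof.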

In order to apply the lemma to the block ranges $w_i$ in place of the $x_i$, we would have to
approximate to the joint distribution of the $w_i$ by a multinormal surface. This is known to be
a poor approximation, as the marginal distributions (viz. the distributions of range w) are
known to be nearer to skew $\chi$-distributions. A better approximation appears to result by
applying the Wilson-Hilferty (1931) transformation to the ranges w; and regarding the 
(w,)# as multinormally distributed. This transformation to normality is certainly effective 
for the marginal distributions of ranges $w_i$. However, proceeding with the poorer approximation
and regarding the $w_i$ as multinormally distributed with variances $\sigma^2 V_n (1 - 1/k)$ and
correlations $\rho_w(n, -1/(k - 1))$ given by (19) and tabulated in Table 4, we apply the lemma and
have the following results:

The range, $w_w$, of the block ranges $w_i$ is approximately distributed as the range in samples
of n from a normal parent with variance $\sigma^2 V_n (1 - 1/k)(1 - \rho_w)$. The mean, $\bar{w}$, of the $w_i$ is
approximately independent of $w_w$, and $\bar{w}/c$ serves as an estimate of $\sigma$, being approximately
distributed as $\chi/\sqrt{\nu}$ based on $\nu$ degrees of freedom. It follows that the ratio

\[ q_w = \frac{w_w}{\bar{w}/c} \Big/ \sqrt{V_n}\, \{(1 - 1/k)(1 - \rho_w)\}^{1/2} \tag{25} \]

can be referred to tables of 'Studentized range', so that we may enter Table 6 with
(i) n = number of $w_i$ = k and (ii) $\nu$ degrees of freedom.

Applying the criterion to our example in § 2, we find: n = 4, k = 5 and from Table 3,
$w_w$ = 2.1; 1 − 1/k = 4/5; from equation (1), $\bar{w}/c$ = 1.52 and from Table 4, $\sqrt{V_n}$ = 0.880 and
$\rho_w$(4, −0.25) = 0.06; hence we have from equation (25), $q_w$ = 1.8. Entering Table 6 for n = 5
and $\nu$ = 10.9 we find that our $q_w$ is clearly insignificant, so that no heterogeneity is indicated.
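The heterogeneity criterion for the example takes only a few lines to evaluate. This sketch uses the block ranges of Table 3 and the constants quoted in the paragraph above; the variable names are illustrative.

```python
import math

block_ranges = [3.8, 3.0, 1.7, 3.4, 2.4]   # Table 3
k = len(block_ranges)                      # 5 blocks
s_w = 1.52                                 # mean range / c, from equation (1)
sqrt_V_n = 0.880                           # Table 4, n = 4
rho_w = 0.06                               # rho_w(4, -0.25), as quoted in the text

# Range of the block ranges and the 'Studentized' ratio q_w.
w_w = max(block_ranges) - min(block_ranges)
q_w = (w_w / s_w) / (sqrt_V_n * math.sqrt((1 - 1 / k) * (1 - rho_w)))
```

The value of `q_w` is then referred to the 'Studentized range' table with n = 5 and $\nu$ = 10.9, as described.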













5. GENERALIZATIONS TO HIGHER ORTHOGONAL DESIGNS 


We confine ourselves here to stating that most of the more complex orthogonal designs can
be analysed by a similar procedure. Suitable residuals from marginal means are again formed,
and their mean range $\bar{w}$ provides an estimate of $\sigma$. The approximate $\chi$-distribution of $\bar{w}$ is
again based on its exact mean and variance, and the determination of the latter does not
require any additional formulae or tables beyond formula (19), Table 4 above, and Table 1
of Patnaik's (1950) paper. The analysis of the split-plot design is particularly convenient.
Before proceeding with these details, however, it seems desirable first to check the accuracy
of the approximations involved in more detail.


REFERENCES 
Cox, D. R. (1949). J. R. Statist. Soc. B, 11, 101.
Hartley, H. O. (1948). Biometrika, 35, 417.
Johnson, N. L. (1948). Biometrika, 35, 80.
Lord, E. (1950). Biometrika, 37, 64.
Patnaik, P. B. (1950). Biometrika, 37, 78.
Pearson, E. S. (1932). Biometrika, 24, 404.
Pearson, E. S. (1950). Biometrika, 37, 88.
Pearson, E. S. & Hartley, H. O. (1943). Biometrika, 33, 89.
Pearson, K. (1931). Tables for Statisticians and Biometricians, Part II. Biometrika Office.
Snedecor, G. (1937). Statistical Methods. Ames, Iowa: Collegiate Press.
Wilson, E. B. & Hilferty, M. M. (1931). Proc. Nat. Acad. Sci., Wash., 17, 684.


ON THE COMPARISON OF ESTIMATORS 
By N. L. JOHNSON 


1. Various methods have been proposed for the comparison of different estimators of a
statistical parameter. Each of these methods is based on some intuitive idea of what property
of an estimator is to be regarded as critical. We shall be concerned with the criterion of
'closeness' proposed by Pitman (1937), and also with the criterion known as 'mean square
error' (e.g. E. Johnson, 1940).

Pitman's criterion is based on the following considerations. Suppose $Y_1$ and $Y_2$ to be two
estimators of a parameter $\theta$. The probability that $Y_2$ is closer than $Y_1$ to $\theta$ is

\[ \Pr\{|Y_2 - \theta| < |Y_1 - \theta|\}. \]
Pitman suggests that the numerical value of this probability be taken as an index of comparison
of the estimators $Y_1$ and $Y_2$. In particular, if the value of the index were greater
than ½, the comparison would be taken to be in favour of $Y_2$; if less than ½, in favour of $Y_1$.*

The mean square error of an estimator Y of $\theta$ is defined as $\mathscr{E}[(Y - \theta)^2]$. Using this criterion,
estimators with smaller mean square errors are considered better than those with larger mean
square errors.

2. In the case of unbiased estimators the mean square error criterion is equivalent to that
of minimum variance. The relations between the criteria of closeness and minimum variance
for unbiased estimators have been discussed by Geary (1944) and Landau (1947). Geary
has shown that the two criteria are equivalent for pairs of unbiased estimators with a
bivariate Normal distribution. Landau introduces the idea of asymptotic closeness: $Y_2$ is an
asymptotically closer estimator of $\theta$ than is $Y_1$ if

\[ \lim_{n \to \infty} \Pr\{|Y_2 - \theta| < |Y_1 - \theta|\} > \tfrac{1}{2} \]

(n denotes the sample size on which $Y_1$, $Y_2$ are based). He shows that under certain conditions
asymptotic closeness and relative efficiency are equivalent.

In the present paper allowance will be made for the effect of biases in the estimators. It 
might be objected that if biases are known to exist it would be better to make the estimators 
unbiased by subtracting the biases from them, rather than to allow for the biases in their 
comparison. However, it is often the case that the existence of bias is expected or suspected, 
but no precise idea of its amount is available (e.g. Cochran, 1942; Hasel, 1942; Madow & 
Madow, 1944; Mann, 1945). In such cases the comparison of possible estimators in a number 
of hypothetical, but likely, circumstances will give useful information on their relative merit. 

3. It is easy to see that the evaluation of Pitman's criterion can be based on the distribution
of the ratio $(Y_2 - \theta)/(Y_1 - \theta)$. In fact

\[ \Pr\{|Y_2 - \theta| < |Y_1 - \theta|\} = \Pr\left\{-1 < \frac{Y_2 - \theta}{Y_1 - \theta} < 1\right\}. \tag{1} \]
Now let
\[ \mathscr{E}(Y_i - \theta) = \delta_i \quad (i = 1, 2), \tag{2} \]
so that $\delta_1$, $\delta_2$ are the biases of the estimators $Y_1$, $Y_2$ respectively, and denote the standard
deviation of $Y_i$ by $\sigma_i$ (i = 1, 2). Then
\[ \frac{Y_2 - \theta}{Y_1 - \theta} = \left(\frac{\sigma_2}{\sigma_1}\right) \frac{Y_2' + (\delta_2/\sigma_2)}{Y_1' + (\delta_1/\sigma_1)}, \tag{3} \]
where
\[ Y_i' = (Y_i - \theta - \delta_i)/\sigma_i \quad (i = 1, 2). \tag{4} \]
$Y_1'$ and $Y_2'$ are the standardized variables corresponding to $Y_1$, $Y_2$ respectively. The criterion
(1) may now be written

\[ \Pr\{-\sigma_1/\sigma_2 < v < \sigma_1/\sigma_2\}, \tag{5} \]

where $v = \{Y_2' + (\delta_2/\sigma_2)\}/\{Y_1' + (\delta_1/\sigma_1)\}$.
This is evidently a function of
(i) the ratio of the standard deviations ($\sigma_1/\sigma_2$),
(ii) the standardized biases ($\delta_1/\sigma_1$, $\delta_2/\sigma_2$),
(iii) parameters of the joint distribution of the standardized variables $Y_1'$, $Y_2'$.

* The value of $\Pr\{|Y_2 - \theta| = |Y_1 - \theta|\}$ will be supposed zero in this paper, which deals with continuous
variables.
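The dependence of the criterion on the three groups of quantities just listed can be explored by simulation. The following is a rough Monte Carlo sketch, not the exact evaluation discussed in the next section: it assumes the standardized variables are bivariate Normal with correlation `rho`, and the function name, seed and number of draws are arbitrary illustrative choices.

```python
import math
import random

def closeness(sd_ratio, b1, b2, rho, draws=20000, seed=1):
    """Estimate Pr{|Y2 - theta| < |Y1 - theta|} by simulation.

    sd_ratio = sigma_2 / sigma_1; b1, b2 = standardized biases delta_i / sigma_i;
    rho = correlation of the standardized variables (assumed bivariate Normal).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        y1 = rng.gauss(0.0, 1.0)
        y2 = rho * y1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        err1 = y1 + b1                  # (Y1 - theta) / sigma_1
        err2 = sd_ratio * (y2 + b2)     # (Y2 - theta) / sigma_1
        if abs(err2) < abs(err1):
            hits += 1
    return hits / draws

# Two unbiased, independent estimators: equal precision, then Y2 twice as precise.
p_equal = closeness(sd_ratio=1.0, b1=0.0, b2=0.0, rho=0.0)
p_better = closeness(sd_ratio=0.5, b1=0.0, b2=0.0, rho=0.0)
```

With equal standard deviations and no bias the estimate should lie near ½, and it should rise above ½ when $Y_2$ has the smaller standard deviation; introducing a bias in $Y_2$ pulls it back down.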


4. In the special case when the joint distribution of $Y_1'$ and $Y_2'$ is bivariate Normal, the
only additional parameter introduced under (iii) of the preceding paragraph is $\rho$, the correlation
between $Y_1'$ and $Y_2'$ (and, of course, between $Y_1$ and $Y_2$). The random variable $v$ is then
the ratio of two Normal variables, and the results of Nicholson (1941, 1943) concerning
the distribution of such a ratio may be used. Using the notation of Nicholson’s first paper
we have, for any numbers $A < B$,

$$\Pr\{A < v < B\} = \Omega(\phi_B) - \Omega(\phi_A), \tag{6}$$


i+p)W-1 1 = a) —1 
oe eee ee 
| 0; 
Q¢9) =£ atoms {a —e-™) +2(1—e-™ —e-™ m) cos? 


7 





2.4 m* 
—e-Mm — e-M yy — p-—mM__ 4 
+35 (1 e e™m—e 57) °°8 6+. (8) 


7 ~- al) -QE)C)} ° 


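In his second paper Nicholson provides tables of a function $V(h,q)$ such that

$$\Omega(\phi) = \frac{\phi}{\pi} + 2V(p\sin\phi,\; p\cos\phi), \tag{10}$$
where
$$p = (2m)^{1/2}. \tag{11}$$
Using these tables the criterion (1) may be evaluated with fair rapidity. We have

$$\Pr\{|Y_2-\eta| < |Y_1-\eta|\} = \Omega(\phi_{\sigma_1/\sigma_2}) - \Omega(\phi_{-\sigma_1/\sigma_2}) = (\Delta\phi/\pi) + 2\Delta V, \tag{12}$$
where
$$\Delta\phi = \phi_{\sigma_1/\sigma_2} - \phi_{-\sigma_1/\sigma_2} = \tan^{-1}\frac{2\sqrt{(1-\rho^2)}\,(\sigma_1/\sigma_2)}{1-(\sigma_1/\sigma_2)^2}, \tag{13}$$
and
$$\Delta V = V(p\sin\phi_{\sigma_1/\sigma_2},\; p\cos\phi_{\sigma_1/\sigma_2}) - V(p\sin\phi_{-\sigma_1/\sigma_2},\; p\cos\phi_{-\sigma_1/\sigma_2}).$$

If $\delta_1 = \delta_2 = 0$ then $\Delta V = 0$, leading to the formula given by Geary (1944).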
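The unbiased special case lends itself to a quick numerical check. The following is only an illustrative sketch, not part of the original computations: the function names `closeness_unbiased` and `closeness_monte_carlo` are hypothetical, the first evaluates $\Delta\phi/\pi$ from (12) and (13) with $\Delta V = 0$, and the second estimates the criterion (1) by simulation for arbitrary standardized biases, assuming a bivariate Normal pair of estimators.

```python
import math
import random


def closeness_unbiased(k, rho):
    """Pr{|Y2 - eta| < |Y1 - eta|} for unbiased bivariate Normal estimators.

    k is sigma_1/sigma_2 and rho the correlation; the value returned is
    Delta(phi)/pi from (12) and (13) with Delta(V) = 0.
    """
    # atan2 keeps the angle in [0, pi], so the case k > 1 is handled too.
    return math.atan2(2.0 * k * math.sqrt(1.0 - rho * rho), 1.0 - k * k) / math.pi


def closeness_monte_carlo(k, rho, d1, d2, trials=100_000, seed=1):
    """Simulation estimate of criterion (1); d1, d2 are delta_i/sigma_i."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(trials):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + s * rng.gauss(0.0, 1.0)  # correlation rho with z1
        # Errors are measured in units of sigma_2, so sigma_1 = k, sigma_2 = 1.
        e1 = k * (z1 + d1)
        e2 = z2 + d2
        hits += abs(e2) < abs(e1)
    return hits / trials
```

With zero biases the two functions should agree to within sampling error; with non-zero biases only the simulation applies, since the closed form omits the $2\Delta V$ term.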


5. According to Pitman’s criterion, the estimator $Y_2$ is better or worse than the estimator
$Y_1$ according as $\Pr\{|Y_2-\eta| < |Y_1-\eta|\} >$ or $< \tfrac{1}{2}$.


N. L. Jonnson 283 


The case $\Pr\{|Y_2-\eta| < |Y_1-\eta|\} = \tfrac{1}{2}$ is thus a boundary between ‘better’ and ‘worse’ in
this sense. Figs. 1-4 show sets of values of $\rho$, $\delta_1/\sigma_1$, $\delta_2/\sigma_2$ and $\sigma_1/\sigma_2$ satisfying this boundary
condition. Each diagram corresponds to a different value of the ratio $\sigma_1/\sigma_2$. Each curve
corresponds to a fixed value of $R = (\delta_2/\sigma_2)/(\delta_1/\sigma_1)$ and gives boundary values of $|\delta_1/\sigma_1|$,
for the given fixed values of $\sigma_1/\sigma_2$ and $R$, as a function of $\rho$. The curves have been drawn for
positive values of $\rho$ and $R$ only. In order to make them complete they should be extended
back to $\rho = -1$ (changing the sign of $R$ can be allowed for by changing the sign of $\rho$ instead).
The appropriate values of $|\delta_1/\sigma_1|$ for given values of $R$ and $\sigma_1/\sigma_2$, when $\rho = -1$, are shown in
Table 1. The curve $R = 0$ is always symmetrical about the axis of $|\delta_1/\sigma_1|$; curves ‘$R$ = const.’
may intersect each other in the range $-1 < \rho < 0$.





































































































[Figs. 1-4: boundary values of $|\delta_1/\sigma_1|$ plotted against $\rho$ (horizontal axis: scale of $\rho$), one curve for each value of $R$.]
Fig. 1. $\sigma_1/\sigma_2 = 0.8$. Fig. 2. $\sigma_1/\sigma_2 = 0.6$. Fig. 3. $\sigma_1/\sigma_2 = 0.4$. Fig. 4. $\sigma_1/\sigma_2 = 0.2$.

Table 1. Boundary values of $|\delta_1/\sigma_1|$ when $\rho = -1.0$

                                     R
sigma_1/sigma_2   0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7
      0.2         3.27   3.30    -      -      -      -      -      -
      0.4         1.47   1.42   1.41   1.48    -      -      -      -
      0.6         0.79   0.76   0.73   0.72   0.72   0.76    -      -
      0.8         0.38   0.36   0.34   0.33   0.32   0.31   0.32   0.33





























It will be noted that $R < \sigma_1/\sigma_2$ in all cases because, if $|R| > \sigma_1/\sigma_2$ then

$$\Pr\{|Y_2-\eta| < |Y_1-\eta|\} < \tfrac{1}{2},$$
whatever the values of $|\delta_1/\sigma_1|$ and $\rho$.











In all cases, if the point $(\rho, |\delta_1/\sigma_1|)$ falls above the appropriate curve, then, according to
Pitman’s criterion, the second estimator ($Y_2$) is better than the first ($Y_1$).


6. If the mean-square error criterion is used, $Y_2$ is considered a better or worse estimator
of $\eta$ than is $Y_1$ according as $E[(Y_2-\eta)^2]$ is less than or greater than $E[(Y_1-\eta)^2]$. Since
$$E[(Y_i-\eta)^2] = \delta_i^2 + \sigma_i^2,$$
it follows that boundary values of $\delta_1/\sigma_1$, $\delta_2/\sigma_2$ and $\sigma_1/\sigma_2$ (according to the mean-square error
criterion) satisfy the relation
$$\delta_1^2 + \sigma_1^2 = \delta_2^2 + \sigma_2^2,$$
i.e.
$$\Big(\frac{\delta_1}{\sigma_1}\Big)^2 = \frac{1-(\sigma_1/\sigma_2)^2}{(\sigma_1/\sigma_2)^2 - R^2}. \tag{14}$$
Diagrams corresponding to Figs. 1-4, but based on the mean-square error criterion, would
consist of lines parallel to the $\rho$-axis. The values of $|\delta_1/\sigma_1|$ determining the positions of
these lines are given by (14); certain such values are shown in Table 2.


Table 2. Boundary values of $|\delta_1/\sigma_1|$ (mean-square error criterion)

                                    |R|
sigma_1/sigma_2   0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7
      0.2         4.90   5.66    -      -      -      -      -      -
      0.4         2.29   2.37   2.65   3.46    -      -      -      -
      0.6         1.33   1.35   1.41   1.54   1.79   2.41    -      -
      0.8         0.75   0.76   0.77   0.81   0.87   0.96   1.13   1.55
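
The tabulated boundary values follow directly from relation (14), which gives $(\delta_1/\sigma_1)^2 = \{1-(\sigma_1/\sigma_2)^2\}/\{(\sigma_1/\sigma_2)^2 - R^2\}$. A minimal sketch of that arithmetic is given below; the function name `mse_boundary` is a hypothetical label introduced only for this illustration.

```python
import math


def mse_boundary(k, R):
    """Boundary value of |delta_1/sigma_1| under the mean-square error criterion.

    k is sigma_1/sigma_2 and R = (delta_2/sigma_2)/(delta_1/sigma_1).
    Returns None when |R| >= k, where relation (14) has no real solution
    (the blank cells of the table).
    """
    if abs(R) >= k:
        return None
    return math.sqrt((1.0 - k * k) / (k * k - R * R))


# Print the rows for sigma_1/sigma_2 = 0.2, 0.4, 0.6, 0.8 and |R| = 0.0, ..., 0.7.
for k in (0.2, 0.4, 0.6, 0.8):
    cells = []
    for j in range(8):
        value = mse_boundary(k, j / 10)
        cells.append("  -  " if value is None else f"{value:5.2f}")
    print(f"{k:.1f}", " ".join(cells))
```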



































It will be noted that, for given values of $R$ and $\sigma_1/\sigma_2$, the values of $|\delta_1/\sigma_1|$ shown in Table 2
are well above the corresponding curves in Figs. 1-4. It thus appears that for Normally
correlated estimators, the mean-square error criterion is more favourable to the estimator
with the smaller standard deviation than is the closeness criterion.


7. A problem to which the criteria described above might be applied is suggested by an 
investigation of Hasel (1942) into the volume of forest timber. A forest area was divided for 
administrative purposes into a number of ‘strips’ of known but unequal area. Measurements 
of the volume of timber $y_i$ on each of a certain number of strips of area $x_i$ ($i = 1, 2, \ldots, n$) were
made. The requirement was to estimate the total volume of timber from a knowledge of the
observed values $(x_i, y_i)$ and the known overall distribution of strip area. Two simple alternative
estimators $Y_1$ and $Y_2$ (see below) were suggested. The data would not be adequate to
determine the actual form of relationship between timber volume and strip area, but comparison
of the two estimators can be made on the assumption, for example, of a parabolic
regression.

Formally, the problem may be stated as follows.

The mean ($\mu'_1$) of a character $X$ in a population $\Pi$ is known. It is desired to estimate the
mean ($\eta$) of a character $Y$ in the same population, using $n$ pairs of observed values $(x_i, y_i)$
($i = 1, \ldots, n$) of $X$ and $Y$ respectively. It is assumed that $x_i$ and $y_i$ are connected by the quadratic
regression equation
$$y_i = \alpha + \beta x_i + \gamma x_i^2 + z_i,$$
where $\alpha$, $\beta$ and $\gamma$ are unknown constants and $z_i$ is a Normal variable with expected value
zero and standard deviation $\sigma$.




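It is desired to compare two estimators of $\eta$, viz.

$$Y_1 = \mu'_1\Big(\sum_{i=1}^{n} y_i\Big)\Big/\Big(\sum_{i=1}^{n} x_i\Big) \quad\text{and}\quad Y_2 = \bar{y} + b(\mu'_1 - \bar{x}),$$

where
$$\bar{y} = n^{-1}\sum_{i=1}^{n} y_i, \quad \bar{x} = n^{-1}\sum_{i=1}^{n} x_i, \quad b = \sum_{i=1}^{n} y_i(x_i-\bar{x})\Big/\sum_{i=1}^{n}(x_i-\bar{x})^2.$$
If $\mu'_s$ denote the $s$th moment of $X$ about zero, then the mean of $Y$, which is to be estimated,
is
$$\eta = \alpha + \beta\mu'_1 + \gamma\mu'_2.$$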


The relative merits of $Y_1$ and $Y_2$ will depend on the values of $\alpha$, $\beta$ and $\gamma$, and also on the
population distribution of $X$. The latter may be known more or less exactly, while the relative
values of $\alpha$, $\beta$, $\gamma$ may be less definitely known. The following discussion indicates how such
knowledge may be used in assessing the likely relative merits of the two estimators.

It can be shown that, in the notation of this paper, 

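$$\frac{\delta_1}{\sigma_1} = \frac{\sqrt{n}}{\mu'_1\sigma}\Big[\alpha(\mu'_1-\bar{x}) + \gamma\Big(\frac{\mu'_1}{n}\sum x_i^2 - \bar{x}\mu'_2\Big)\Big],$$

$$\frac{\delta_2}{\sigma_2} = \frac{\gamma\sqrt{n}}{\sigma}\Big[\frac{1}{n}\sum x_i^2 - \mu'_2 + \nu(\mu'_1-\bar{x})\Big]\Big(1+\frac{n(\mu'_1-\bar{x})^2}{\sum(x_i-\bar{x})^2}\Big)^{-1/2},$$
where $\nu = 2\bar{x} + \big(\sum(x_i-\bar{x})^3\big)\big(\sum(x_i-\bar{x})^2\big)^{-1}$,
$$\frac{\sigma_1}{\sigma_2} = \frac{\mu'_1}{\bar{x}}\Big(1+\frac{n(\mu'_1-\bar{x})^2}{\sum(x_i-\bar{x})^2}\Big)^{-1/2},$$

$$\rho = \Big(1+\frac{n(\mu'_1-\bar{x})^2}{\sum(x_i-\bar{x})^2}\Big)^{-1/2}.$$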
Suppose now that it is known that $\mu'_1 = 45$, $\mu'_2 = 2250$, and that we observe $n = 25$, $\bar{x} = 60$,
$\sum(x_i-\bar{x})^2 = 10{,}000$ and $\sum(x_i-\bar{x})^3 = 100{,}000$. Then
$$\sigma_1/\sigma_2 = 0.6, \quad \rho = 0.8, \quad \delta_1/\sigma_1 = (5000\gamma - \tfrac{5}{3}\alpha)\sigma^{-1}, \quad \delta_2/\sigma_2 = -800\gamma\sigma^{-1},$$
$$R = 480\gamma/(\alpha - 3000\gamma).$$
$Y_1$ will be superior to $Y_2$ according to both criteria when $|R| > \sigma_1/\sigma_2$, i.e. when
$$2200 < \alpha/\gamma < 3800,$$
whatever be the value of $\sigma$.
For any other value of $\alpha/\gamma$, $Y_1$ will be a closer estimator of $\eta$ than $Y_2$ for values of $\sigma$ above
some critical limit, and $Y_2$ will be closer than $Y_1$ for values of $\sigma$ below this limit.

The mean-square errors of $Y_1$ and $Y_2$ will be equal when

$$0.64\sigma^2 = (2200\gamma - \alpha)(3800\gamma - \alpha).$$

Here again, for a given value of $\alpha/\gamma$ outside the interval 2200-3800, there will be a critical
value of $\sigma$ above which $Y_1$ is superior to $Y_2$ according to the mean-square error criterion and
below which $Y_2$ is superior to $Y_1$. This critical value will of course be less than the corresponding
critical value when the closeness criterion is used (see end of §6 and diagrams).
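
The numerical values quoted in this section can be reproduced by a short calculation. The sketch below is an illustration under stated assumptions, not the author's own working: it assumes that $Y_1$ is the ratio estimator $\mu'_1\sum y_i/\sum x_i$ and $Y_2$ the regression estimator $\bar{y} + b(\mu'_1-\bar{x})$, that the $x_i$ are treated as fixed, and the function name `section7_quantities` and its parameter names are hypothetical.

```python
import math


def section7_quantities(alpha, gamma, sigma, mu1=45.0, mu2=2250.0,
                        n=25, xbar=60.0, s2=10_000.0, s3=100_000.0):
    """Return (sigma_1/sigma_2, rho, delta_1/sigma_1, delta_2/sigma_2).

    s2 and s3 are the sums of squared and cubed deviations of the x_i
    from their mean; beta cancels out of both biases and is not needed.
    """
    m2 = xbar ** 2 + s2 / n              # n^-1 * sum of x_i^2
    nu = 2.0 * xbar + s3 / s2
    sd1 = mu1 * sigma / (xbar * math.sqrt(n))
    sd2 = sigma * math.sqrt(1.0 / n + (mu1 - xbar) ** 2 / s2)
    bias1 = alpha * (mu1 / xbar - 1.0) + gamma * (mu1 * m2 / xbar - mu2)
    bias2 = gamma * (m2 - mu2 + nu * (mu1 - xbar))
    rho = 1.0 / math.sqrt(1.0 + n * (mu1 - xbar) ** 2 / s2)
    return sd1 / sd2, rho, bias1 / sd1, bias2 / sd2


# Example: alpha = 3300, gamma = 1, sigma = 1 lies inside 2200 < alpha/gamma < 3800.
ratio, rho, d1, d2 = section7_quantities(3300.0, 1.0, 1.0)
print(ratio, rho, d2 / d1)  # d2/d1 is R, to be compared with sigma_1/sigma_2
```

Setting `alpha = 1, gamma = 0` and then `alpha = 0, gamma = 1` isolates the coefficients of $\alpha$ and $\gamma$ in the two standardized biases.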

8. An interesting example of the use of the two criteria described above is provided by
the following extension of Geary’s (1944) treatment of a problem proposed by Schrödinger.
The problem is, essentially, given $r$ independent random variables $x_1, x_2, \ldots, x_r$ with

$$p(x_i) = \begin{cases} 1/n & (0 < x_i < n) \\ 0 & \text{elsewhere,} \end{cases}$$

it is required to obtain an estimate of $n$.










Geary shows that the closest estimate (according to Pitman’s definition) is $\tilde{n} = 2^{1/r}w$,
where $w$ is the greatest of $x_1, x_2, \ldots, x_r$.
The mean-square error of $\lambda w$, where $\lambda$ is a constant, is

$$n^2\Big\{\frac{1}{(r+1)^2} + \frac{r}{r+2}\Big(\lambda - \frac{r+2}{r+1}\Big)^2\Big\}.$$
This is a minimum when $\lambda = (r+2)/(r+1)$.
The closest estimator $\tilde{n}$, and the estimator $n' = w(r+2)/(r+1)$, will now be briefly compared.
It is easy to show that

$$\Pr\{|\tilde{n}-n| < |n'-n|\} = 2^r\Big(2^{1/r} + \frac{r+2}{r+1}\Big)^{-r} \quad (r \le 3)$$
$$\phantom{\Pr\{|\tilde{n}-n| < |n'-n|\}} = 1 - 2^r\Big(2^{1/r} + \frac{r+2}{r+1}\Big)^{-r} \quad (r > 3).$$

Table 3 shows (a) the ratio of mean-square error of $\tilde{n}$ to that of $n'$, and (b)

$$\Pr\{|\tilde{n}-n| < |n'-n|\},$$
for various values of $r$. This table demonstrates how the ‘relative merit’ of the estimators
depends on the criterion used in their comparison.























Table 3

   r       (a)       (b)
   1      1.333     0.571
   2      1.029     0.530
   3      1.001     0.505
   4      1.002     0.509
   5      1.008     0.519
  10      1.036     0.542
  20      1.061     0.556
  ∞       1.094*    0.571†

* $1+(1-\log_e 2)^2$.   † $1-(2e)^{-1/2}$.
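
Both columns of the table can be recomputed from the multipliers of $w$ alone. The sketch below is a hedged illustration: the function names `mse_ratio` and `closeness_prob` are hypothetical, and the expressions are derived here from the uniform distribution of the $x_i$ on $(0, n)$ (so that $\Pr\{w/n < t\} = t^r$) rather than copied from the printed formulae. The branch in `closeness_prob` compares the two multipliers directly, which for integer $r$ is equivalent to the split at $r = 3$.

```python
import math


def mse_ratio(r):
    """Column (a): mean-square error of 2**(1/r) * w over that of n'."""
    lam_close = 2.0 ** (1.0 / r)
    lam_mse = (r + 2.0) / (r + 1.0)
    return 1.0 + r * (r + 1.0) ** 2 / (r + 2.0) * (lam_close - lam_mse) ** 2


def closeness_prob(r):
    """Column (b): probability that 2**(1/r) * w is closer to n than n'."""
    lam_close = 2.0 ** (1.0 / r)
    lam_mse = (r + 2.0) / (r + 1.0)
    # The two absolute errors are equal when w/n = 2 / (lam_close + lam_mse).
    t = (2.0 / (lam_close + lam_mse)) ** r
    return t if lam_close > lam_mse else 1.0 - t


for r in (1, 2, 3, 4, 5, 10, 20):
    print(r, round(mse_ratio(r), 3), round(closeness_prob(r), 3))
```

The recomputed figures may differ from the printed ones by a unit in the last place.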


The maximum-likelihood estimator of $n$ is $w$. It is remarkable that this estimator appears
in a rather poor light if compared with $\tilde{n}$ and $n'$ according to either of the criteria described
in this paper. The ratio of the mean-square error of $w$ to that of $n'$ is $(2r+2)/(r+2)$, which
tends to 2 as $r$ tends to infinity. It can also be shown that

$$\lim_{r\to\infty} \Pr\{|\tilde{n}-n| < |w-n|\} = 1/\sqrt{2} = 0.707$$
and
$$\lim_{r\to\infty} \Pr\{|n'-n| < |w-n|\} = 1/\sqrt{e} = 0.607,$$
i.e. both $\tilde{n}$ and $n'$ are asymptotically closer to $n$ than is $w$. This does not conflict with Landau’s
theorem (see § 2), since $\tilde{n}$, $n'$ and $w$ are all efficient estimators of $n$.
$\tilde{n}$, $n'$ and $w$ are all biased estimators of $n$. An unbiased estimator of $n$ is
$$n'' = (r+1)w/r.$$

The ratio of the mean-square error of $n''$ to that of $n'$ is $(r+1)^2/r(r+2)$. This is greater than
unity because the smaller variance of $n'$ more than compensates for its bias. We note also that
$$\lim_{r\to\infty} \Pr\{|\tilde{n}-n| < |n''-n|\} = 1-(2e)^{-1/2} = 0.571,$$

$$\lim_{r\to\infty} \Pr\{|n'-n| < |n''-n|\} = 1-e^{-1} = 0.632,$$

$$\lim_{r\to\infty} \Pr\{|w-n| < |n''-n|\} = 1-e^{-1/2} = 0.393.$$







9. It is doubtful whether it is ever justifiable to assign any absolute optimum property 
which shall be used automatically in deciding between different estimators. It is, of course, 
possible to say that on utilitarian grounds a ‘cost function’ $f(Y-\eta)$, with the error of estimation
$(Y-\eta)$ as argument, may be defined, and that $Y$ should be chosen so as to minimize
the ‘average cost’ $\int_{-\infty}^{\infty} f(Y-\eta)\,p(Y)\,dY$. The definition of $f(Y-\eta)$ in any particular case is,


however, likely to be a matter of considerable trouble. Nevertheless, objective criteria are 
certainly useful as data to help in the selection of an estimator. This being so, it is desirable 
to bear in mind just what properties of estimators the various criteria summarize, and how 
pertinent they are to any particular problem. 

On the whole, comparison of mean-square errors appears to be somewhat more satisfactory 
than Pitman’s criterion of closeness. Use of the mean-square error seems to allow for the 
actual size of error in a more precise way than does the closeness criterion. If the cost function 
$f(Y-\eta)$ referred to above be taken proportional to $(Y-\eta)^2$ we are led to the mean-square
error criterion; the closeness criterion does not seem to be related to any particular form of
cost function. Finally it should be noted that mean-square error ranks estimators in a unique
order of merit while, as Pitman (1937) pointed out, it is possible for $Y_1$ to be a closer estimator
of $\eta$ than $Y_2$, and $Y_2$ closer than $Y_3$, yet $Y_3$ closer than $Y_1$. If the estimators may be biased, this
can happen even with independent Normally distributed estimators. If all three estimators
are unbiased and they are Normally correlated then closeness does define an order of merit 
(Geary, 1944). On the other hand, it is in general possible to define sets of three unbiased 
estimators with a circular closeness relation. 

It must be admitted that the closeness criterion does provide information not given by the 
mean-square error. In particular, it is always possible to use the closeness criterion, regardless 
of the existence or non-existence of the first and second moments of the estimators. Ideally, 
of course, it would be desirable to use both criteria (and, indeed, others such as the mean 
absolute error (Hamaker, 1949)). Here again, however, difficulties of computation militate 
against the use of the closeness criterion. 

In the foregoing discussion the condition of unbiasedness has entered only indirectly. It 
is sometimes claimed that avoidance of bias is of special importance because bias is not 
removed when estimates from successive samples are combined, while variability is usually 
reduced. It must be remembered, however, that this argument applies only where the same 
parameter is to be estimated over and over again and the bias is at any rate preponderantly 
of constant sign from one estimate to another. Such a case could be formally treated using 
the methods of this paper applied to single estimators based on large samples divided into 
smaller subsamples. 


REFERENCES 


Cochran, W. G. (1942). J. Amer. Statist. Ass. 37, 199.
Geary, R. C. (1944). Biometrika, 33, 123.
Hamaker, H. C. (1949). Statistica, 3, 209.
Hasel, A. A. (1942). Ann. Math. Statist. 13, 179.
Johnson, E. (1940). Ann. Math. Statist. 11, 453.
Landau, H. G. (1947). Univ. Pittsburgh Bull. 43, 143.
Madow, W. G. & Madow, L. H. (1944). Ann. Math. Statist. 15, 1.
Mann, H. B. (1945). Ann. Math. Statist. 16, 85.
Nicholson, C. (1941). Biometrika, 32, 16.
Nicholson, C. (1943). Biometrika, 33, 59.
Pitman, E. J. G. (1937). Proc. Camb. Phil. Soc. 33, 212.
















A RAPID METHOD FOR ASCERTAINING 
SERIAL LAG CORRELATION 


By GORDON D. GIBSON, University of Chicago 


1. INTRODUCTION 


The search for a mechanical means for rapidly assessing the similarity between two time 
series, and particularly for matching series of tree ring width measurements, has led the 
writer to construct a photoelectric device for the comparison of time series in the form of bar 
graphs. This paper is a discussion of a method and some problems of serial lag correlation by 
photoelectric measurement of such bar graphs. Portions of the theory developed in this 
connexion are also relevant to the rapid calculation of serial lag correlation by ordinary means. 

The study of tree ring width series, called dendrochronology, had its inception chiefly in 
the researches of the astronomer and climatologist, A. E. Douglass (1919, 1928, 1936). 
This study makes use of the fact that trees of the same species growing under similar con- 
ditions within a given climatic region agree to a considerable extent in the variations in width 
of their annual growth rings. By matching or ‘cross-identifying’ overlapping series of ring 
widths, extending back from living trees to ancient specimens collected from prehistoric 
sites in the south-western United States, Douglass and his colleagues have established the 
average tree ring width record for nearly 2000 years past. This record gives valuable infor- 
mation concerning the climatic fluctuations of the period. By matching the rings of timbers 
from archaeological sites with the master ring width record at the proper position, the 
approximate year of construction for a great number of prehistoric habitations in this region 
has been determined. 

The problem of cross-identifying two tree ring width series is the problem of determining 
the relative position in which they are most similar, and in which the similarity is, with high 
probability, due to the same climatic factors rather than either to mere chance or to harmonic 
features displaced by an integral number of cycles. 

This problem is analogous to that of discovering the lag between two time series in economic, 
physical, and other studies. In the latter type of serial correlation problem the time bases 
are known for each series, i.e. each measurement corresponds to a known point on the time 
scale, but the lag remains to be discovered. Such problems might be termed dyschronous. 
In the dendrochronological problem the time base of one series is unknown, but the lag is 
known to be zero. By matching the two series, the time base of the unknown series may be 
determined from the known series. Such problems might be termed synchronous. The 
match position in synchronous serial correlation corresponds to the lag position in dyschronous 
correlation. 

The standard statistical correlation procedures are much too time-consuming for use in 
cross-identification where large numbers of tree ring series must be compared at innumerable 
match hypotheses. The photoelectric method of comparing bar graphs of time series is 
designed specifically for the rapid computation of measures of similarity between such series. 
It is equally applicable to dyschronous serial correlation, and to autocorrelation by employing 
pairs of identical graphs. 










The photoelectric method of assessing the similarity between a pair of time series derives 
from the common practice of comparing them by drawing their line graphs on trans- 
parent surfaces, superimposing the graphs, and moving one graph back and forth along 
their common base until the closest apparent similarity is observed. Variations upon this 
method have appeared in solutions of the dendrochronological problem referred to above. 
In the first of these, that employed by Douglass, one of a pair of bar graphs is slid along beside 
another for visual comparison. These ‘skeleton plots’ record, ordinarily, only the narrower 
rings. Plotted bar height is roughly inversely proportional to ring width, and is based on 
visual assessment of the narrowness of a ring relative to those on either side. Excessively 
large rings also may be noted as secondary characteristics of a ring pattern, but these are not 
recorded as bars on the graph. In the semi-arid south-western United States the narrow rings 
are held to be diagnostic because they reflect drought conditions which are widespread and 
affect nearly all trees. Douglass’s method, in effect, recognizes that visual methods hang 
most heavily upon the major divergences and ignore the minor ones. It is generally sufficiently 
accurate in the south-west, in which area most dendrochronological work has been done. 

A recent variation on the method of visual comparison employs bar graphs of all ring widths 
drawn from actual measurement and expressed as deviations from a shifting mean (Gladwin, 
1940). This appears to be a more objective and a more accurate method of cross-dating. 
Increased accuracy is particularly necessary in regions where yearly climatic variations are 
not so strongly marked. In this method the similarity at likely points of cross-matching is 
expressed as a coefficient, but the particular measure of similarity devised by Gladwin has 
certain inherent faults which the writer believes will occasionally lead to spurious results 
(Gibson, 1947; see also O’Bryan, 1948). 

In time series comparisons it is common practice to search for the position of maximum 
similarity by visual comparison of graphs and then to express the degree of similarity in 
mathematical terms by deriving a coefficient of correlation. But there are several objections 
to this procedure: 

(1) Visual methods frequently are attempts to discover the position of maximum agree- 
ment, while they should be equally concerned with finding the position of minimum dis- 
agreement. It will be demonstrated below that one is not always merely the inverse of the 
other. 

(2) Visual methods usually attempt to make the major variations agree and largely ignore 
the moderate variations. Thus a large part of the data is unused. 

(3) The two series under graphical comparison are generally not comparable, either with 
respect to the position of the mean lines which should be congruent, or with respect to the
deviations from the means which should be in comparable units. The customary mathe- 
matical correlation methods do compare the series as if their mean lines were congruent and 
as if each series were expressed in standard units. Purely graphical methods ignore these 
requirements, but though hidden they exert their influence. But even if the series are 
prepared for graphical comparison to satisfy these two requirements, they will be accurately 
prepared for only a single position of comparison and for the full series of each. The two series 
must, of course, be of equal length. If one series is shifted with reference to the other by a 
single interval, the complete series are no longer being compared, for one measurement has 
been dropped from one series at one end and one from the other series at the opposite end. 
The means from which the deviations are computed are no longer the means of the spans 
being compared. The greater the shift the more items are dropped and the greater the
inaccuracy. This error, which I term the error of imperfect series, together with an approximate
correction for it, will be dealt with later in detail, for it occurs in the proposed scheme as well 
as in visual methods, although it is generally ignored in the latter. 

(4) If it is difficult to decide by inspection between several possible match positions, 
mathematical methods must be resorted to for each match hypothesis. This is often a lengthy 
procedure. 

When mathematical statements of serial correspondence have been derived, either as an 
aid to choosing between lag hypotheses or as a final expression of the degree of correspondence, 
they have usually been the familiar second-moment correlation coefficient r. A less well- 
known measure, but one that has been proposed as better suited to time-series comparisons, 
is the coefficient of first-moment correlation C,. The advantages of first-moment correlation 
are two: 

(1) It is simpler to compute, with or without mechanical aids, for there are no squares or 
cross-products involved. Rapidity of computation becomes a major factor when lengthy
series must be compared for a great many lag hypotheses. 

(2) It has certain theoretical advantages over second-moment correlation. The two 
measures of correlation agree in the order in which they will rank various normal bivariate 
distributions but differ in the order in which they will arrange distributions which depart 
from the normal. Second-moment correlation weights points which diverge from the regression 
line relating the two series in proportion to the square of their distance from that line. 
Thus, a few widely erratic points will bear more weight than a multitude of points close to 
the line. First-moment correlation weights all points in direct proportion to their distances 
from the regression line. 

It should be noted here that neither coefficient is to be interpreted as ‘percentage of 
correlation’. They are more properly considered to be merely indexes for arranging bivariate 
serial distributions in order of their degree of correspondence. Geometric interpretation of 
both coefficients is offered in a later section. 

Second-moment correlation has been much more used in statistical work because of the 
mathematical nicety of the proofs and demonstrations surrounding it, because of the ease 
with which the results may be subjected to further mathematical tests, and because of its 
familiarity. These reasons, however, should not be allowed to influence the practical applica- 
tions if they do not impair the validity of the coefficient. 

First-moment correlation lends itself readily to photoelectric assessment of the corre- 
spondence between two series, which greatly shortens the time required to compute numerous 
correlation coefficients. It should be particularly advantageous in those cases in which mathe- 
matical coefficients must be derived in order to choose between several match or lag hypo- 
theses. If coefficients of correlation are not required, as in a simple determination of the 
match or lag position, photoelectric assessment provides a method which uses all the data 


and determines the points of maximum agreement and minimum disagreement with greater 
certainty than do visual methods. 


2. THEORETICAL BASIS 
Theory of photoelectric comparison 


Bar and line graphs. A series of measurements at equal intervals may be represented by 
a bar graph or by a line drawn through the observed points. Although many types of series 




consisting of measurements of variables known to change smoothly may be more or less 
accurately represented as continuous functions of their serial bases, biometric and social 
data are often more truthfully represented by bar graphs which involve no questionable 
hypotheses concerning the intervals between observations. In series such as occur in tree 
ring studies the data represent annual increments of growth and cannot be represented as 
a smooth function of time. It is not possible, for example, to estimate the increment for year 
2 from a curve drawn through the points representing the increments for years 1 and 3. The 
theory developed in the following pages assumes the use of bar graphs, but it is readily 
extensible to smooth curves. 

Measurement of similarity. If the variates are represented as transparent bars on an opaque 
background, the light transmitted by a pair of superimposed graphs may be measured photo- 
electrically. The medium chosen for this purpose in the writer’s device is 35mm. motion- 
picture film. By means of a graphing camera the variables are recorded as transparent bars 
on a film which becomes elsewhere opaque upon development. The bars are of constant 
width and variable height and lie transversely across the film. The films for two series are 
superimposed in the beam from a sufficiently constant light source, and the amount of light 
transmitted by a given span of the pair is measured photoelectrically. The light transmitted 
and the photoelectric response are proportional to the transparent area common to both 
series. As one film is shifted with respect to the other along their common axes, the light 
transmission varies, and the relative position of maximum transmission, and therefore of 
closest agreement between the series, is readily found. The displacement between the films 
at the point of maximum transmission is, of course, the amount by which one series lags or 
leads the other. Measurements of the light transmitted may be converted into coefficients 
of similarity.* It is obvious that for meaningful results appropriate adjustments must be 
made as to scale and dispersion. The usual method of making these adjustments is to divide 
the deviations of a series by the standard deviation of the series or by the average deviation 
of the series. Coefficients of similarity may then be compared in relation to a fixed scale. 

The number of bars included in a span at a single assessment depends on the width of the 
bars and the dimensions of the apparatus. Spans up to 100 bars in length may be compared 
in the device constructed by the writer. This length of span is usually sufficient for accurate 
determination of the lag or match position between two series if the correlation is moderately 
high. Measurements of successive spans can be added to determine the agreement of longer 
series. 

Although a coefficient of agreement may be derived from a single transmission reading, 
a truer statement of the correspondence between two series requires also a measurement of 
the disagreement between them. For this purpose a second reading of light transmission is 
required. The problem is perhaps best approached through a derivation of the first-moment 
correlation coefficient and its determination from the transmission readings. 

Definition of the first-moment correlation coefficient. The first-moment correlation coefficient 
as defined by Davies (1930) is here adapted as follows: let $D_x = \sum|x|/n$, where $x$ is the deviation
from the mean, and let $a$ be the deviation in average deviation terms, $a = x/D_x$. It is


* So far as the writer has been able to discover, this method of determining the similarity between 
two series is new in several respects. Other electrical analogical computing devices have been described 
by Gray (1931), Hazen & Brown (1940), the latter being a further development of Gray’s device, Martin- 
dale (1941), Foster (1946), and Orcutt (1948). When these devices are concerned with correlation it is 
with the second-moment coefficient, and they are therefore perforce somewhat more complex than that 
developed by the writer. 




readily demonstrated that $\sum a = 0$ and $\sum|a| = n$. Let $s$ measure the similarity between a
pair of values $a_1$ and $a_2$ such that the modulus of $s$ will equal the modulus of the smaller, and
such that $s$ will be positive or negative according as $a_1$ and $a_2$ are the same or opposite in sign,
respectively. This may be written

$$s = \tfrac{1}{2}(|a_1+a_2| - |a_1-a_2|). \tag{1}$$
The first-moment correlation coefficient is defined as
$$C_1 = \sum s/n. \tag{2}$$
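
The definition can be followed step by step in a few lines of code. This is a minimal sketch of equations (1) and (2) for two series of equal length; the function name `first_moment_correlation` is a hypothetical label, and the series are assumed not to be constant, since the average deviation must be non-zero.

```python
def first_moment_correlation(x, y):
    """C_1 of equation (2) for two equal-length series of measurements."""
    if len(x) != len(y):
        raise ValueError("the two series must be of equal length")
    n = len(x)

    def in_average_deviation_units(series):
        mean = sum(series) / n
        deviations = [value - mean for value in series]
        d = sum(abs(v) for v in deviations) / n   # average deviation D_x
        return [v / d for v in deviations]        # a = x / D_x

    a1 = in_average_deviation_units(x)
    a2 = in_average_deviation_units(y)
    # Equation (1): s has the modulus of the smaller of a1, a2, and is
    # positive when they agree in sign, negative when they disagree.
    s = [0.5 * (abs(u + v) - abs(u - v)) for u, v in zip(a1, a2)]
    return sum(s) / n


print(first_moment_correlation([1, 2, 3, 4], [2, 4, 6, 8]))
print(first_moment_correlation([1, 2, 3, 4], [4, 3, 2, 1]))
```

Because only absolute values and sums are involved, no squares or cross-products are required, which is the computational advantage claimed for the first-moment coefficient.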


Relation between first- and second-moment correlation. A convenient geometric illustration 
is presented by Davies to demonstrate the relationship between first- and second-moment 
correlation. This demonstration is basic to further development, and is here adopted and 
expanded. Following Davies, we begin with a discussion of second-moment correlation. 
Let the variates expressed in standard deviation units be denoted by $A_1$ and $A_2$, and plotted
as a scatter diagram. The diagonals of the scatter diagram lying at angles of 45° and 135°
with the abscissa are drawn, the diagonal with positive slope is termed $P$, and the diagonal
with negative slope $Q$. The perpendicular distances from a point $(A_1, A_2)$ to the two diagonals
are, respectively, $P = (A_1+A_2)/\sqrt{2}$ and $Q = (A_1-A_2)/\sqrt{2}$.

The variances of these distances for all points of the distribution are, respectively,
$$\sigma_P^2 = \sum(A_1+A_2)^2/2n \quad\text{and}\quad \sigma_Q^2 = \sum(A_1-A_2)^2/2n.$$
When all points lie on the positive diagonal $P$, there is complete positive correlation,
$$A_1 = A_2, \quad \sigma_Q^2 = 0 \quad\text{and}\quad \sigma_P^2 = \sum(2A)^2/2n = 2\sigma_A^2.$$
But $A$ being in standard units, $\sigma_A = 1$ and $\sigma_P^2 = 2$. Similarly, when all points lie on the
negative diagonal $Q$, there is complete negative correlation: $\sigma_P^2 = 0$ and $\sigma_Q^2 = 2$. Thus a
measure of the degree to which the points tend to approach one or the other diagonal may
be written $C_2 = \tfrac{1}{2}(\sigma_P^2 - \sigma_Q^2)$, and this coefficient will vary from $+1$ when all points lie on the
positive diagonal to $-1$ when all points lie on the negative diagonal, and will equal 0 when
they tend equally toward either. It is easily shown that $C_2 = r$, the Pearsonian coefficient of
correlation, for by expanding the above expressions,

$$\sigma_P^2 = \sum(A_1^2+2A_1A_2+A_2^2)/2n \quad\text{and}\quad \sigma_Q^2 = \sum(A_1^2-2A_1A_2+A_2^2)/2n,$$

whence $C_2 = \sum(A_1A_2)/n$. But $\sigma_{A_1} = \sigma_{A_2} = 1$, and the last equation for $C_2$ is identical with
$r = \sum(A_1A_2)/n\sigma_{A_1}\sigma_{A_2}$.

An analogous derivation may be followed for first-moment correlation. The data are
plotted in average deviation units; the distances from any point to the diagonals are

$$p = (a_1+a_2)/\sqrt{2} \quad\text{and}\quad q = (a_1-a_2)/\sqrt{2},$$
and the average deviations for all points of the distribution are
$$D_p = \sum|a_1+a_2|/(n\sqrt{2}), \quad D_q = \sum|a_1-a_2|/(n\sqrt{2}). \tag{3}$$
When all points lie on the positive diagonal $p$, there is complete positive correlation,
$$a_1 = a_2, \quad D_q = 0 \quad\text{and}\quad D_p = \sum|2a|/n\sqrt{2} = D_a\sqrt{2}.$$
But $a$ is in average deviation units, whence $D_a = 1$ and $D_p = \sqrt{2}$. Similarly, when all points
lie on the negative diagonal $q$, there is complete negative correlation, $D_p = 0$ and $D_q = \sqrt{2}$.
Thus a measure of the degree to which the points tend to approach one or the other diagonal
may be written
$$C_1 = (D_p - D_q)/\sqrt{2}. \tag{4}$$




By substituting the expressions in equation (3) it is readily shown that the coefficients
defined in equations (2) and (4) are identical. $C_1$ varies, as does r, from +1 when all points
lie on the positive diagonal to -1 when all points lie on the negative diagonal, and is
zero when they tend equally toward either.
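
The limiting values of $C_1$ can be checked numerically. The following is a minimal sketch, assuming a short hypothetical series (the data and the function names are illustrative and not part of the original method); it evaluates equation (4) for a series matched with itself and with its own negative.

```python
import math

def to_average_deviation_units(xs):
    """Express a series as deviations from its mean in units of its average deviation."""
    mean = sum(xs) / len(xs)
    deviations = [x - mean for x in xs]
    average_deviation = sum(abs(d) for d in deviations) / len(deviations)
    return [d / average_deviation for d in deviations]

def first_moment_coefficient(a1, a2):
    """C_1 = (D_p - D_q) / sqrt(2), with D_p and D_q as in equation (3)."""
    n = len(a1)
    d_p = sum(abs(u + v) for u, v in zip(a1, a2)) / (n * math.sqrt(2))
    d_q = sum(abs(u - v) for u, v in zip(a1, a2)) / (n * math.sqrt(2))
    return (d_p - d_q) / math.sqrt(2)

# Hypothetical data, chosen only to exercise the two limiting cases.
series = to_average_deviation_units([3.0, 5.0, 4.0, 9.0, 1.0])
c_same = first_moment_coefficient(series, series)                      # all points on p
c_opposite = first_moment_coefficient(series, [-v for v in series])    # all points on q
```

Under these assumptions `c_same` is +1 and `c_opposite` is -1, up to floating-point rounding.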

It is demonstrated by Yule & Kendall (1937, p. 232) that for a normal bivariate distribution
$\sigma_1^2 + \sigma_2^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2$, and that $\sigma_1\sigma_2 = \sigma_{x_1}\sigma_{x_2}\sqrt{1 - r^2}$, where $\sigma_1^2$ and $\sigma_2^2$ are the variances
about the principal axes of the distribution and where $\sigma_{x_1}^2$ and $\sigma_{x_2}^2$ are the variances about the
co-ordinate axes. But in a normal distribution $D_x = \sqrt{2/\pi}\,\sigma_x$. Yule & Kendall (p. 182)
state that this relationship is 'very approximately true of curves which do not differ markedly
from the normal form'. Substituting this in the above equations and taking account of the
fact that $D_{a_1} = D_{a_2} = 1$,
$$D_p^2 + D_q^2 = D_{a_1}^2 + D_{a_2}^2 = 2, \tag{5}$$
$$D_pD_q = D_{a_1}D_{a_2}\sqrt{1 - r^2} = \sqrt{1 - r^2}. \tag{6}$$
Solving these equations simultaneously,
$$D_p = \sqrt{1 + r}, \quad D_q = \sqrt{1 - r} \quad\text{and}\quad r = \pm\sqrt{1 - D_p^2D_q^2} = \tfrac12(D_p^2 - D_q^2).$$
Equation (4) may then be written
$$C_1 = \{\sqrt{1 + r} - \sqrt{1 - r}\}/\sqrt{2} = \pm\sqrt{1 - D_pD_q} \quad\text{or}\quad (1 - C_1^2)^2 = 1 - r^2. \tag{7}$$


The Gressens and Mouzon coefficient. Gressens & Mouzon (1927) define a coefficient of
similarity, $S = \sum |a_1 - a_2|/\sum (|a_1| + |a_2|)$.* But $\sum |a_1| = \sum |a_2| = n$, and the formula may
be simplified to
$$S = \sum |a_1 - a_2|/2n = D_q/\sqrt{2}.$$
When the correspondence between the two series is complete and positive, $a_1 = a_2$ and
S = 0. When the correspondence between them is complete and negative, $a_1 = -a_2$ and S = 1.
It is readily shown, using equations (4) and (5), that in a normal distribution
$$C_1 = \sqrt{1 - S^2} - S.$$

An equally valid measure of similarity would be $T = \sum |a_1 + a_2|/2n = D_p/\sqrt{2}$. This coefficient
varies from +1 for complete positive correspondence to 0 for complete negative
correspondence.

A geometric illustration of three coefficients of similarity. A normal bivariate distribution
may be represented as a three-dimensional surface in which the height from the $X_1X_2$ plane
to any point on the surface represents the relative frequency at that point, and the volume
enclosed by any vertical cylinder bounded below by the $X_1X_2$ plane and above by the surface
represents the relative frequency corresponding to the area of the $X_1X_2$ plane included. It
is shown in Yule & Kendall (p. 230ff.) that the intercepts of planes parallel to the $X_1X_2$
plane with the surface are similar ellipses, similarly situated. The major and minor axes of
such an ellipse are proportional to the average (and standard) deviations measured along
those axes. The ellipse which passes through the points on the principal axes at distances
equal to the average deviations measured with respect to those axes may be termed the
'ellipse of unit average deviation'.

The equation of the ellipse of unit average deviation for the case in which the deviations
are expressed in terms of their average deviations ($a_1$ and $a_2$) is easily derived by a 45°
rotation of the general equation for an ellipse with its centre at the origin and its major
and minor axes lying along the co-ordinate axes and equal in length to $2D_p$ and $2D_q$. This
equation is
$$\frac{(a_1 + a_2)^2}{2D_p^2} + \frac{(a_1 - a_2)^2}{2D_q^2} = 1.$$
By substituting the relations developed above, $D_p = \sqrt{1 + r}$ and $D_q = \sqrt{1 - r}$, this becomes
$$a_1^2 - 2ra_1a_2 + a_2^2 = 1 - r^2. \tag{9}$$

* The formula is given as $S = \sum |a - b|/\sum (|a| - |b|)$, where a = our $a_1$ and b = our $a_2$, but the negative
sign in the denominator is obviously an error.


In the accompanying geometric representation (Fig. 1) the ellipse of unit average devia- 
tion has been drawn for a case of moderately high positive correlation. Both variates are 
expressed in terms of their own average deviations. The broken line circle represents the 
unit average deviation ellipse for zero correlation. 


Fig. 1. Ellipse of unit average deviation and measures of correlation.

The three measures of similarity described above may be interpreted geometrically with
the aid of this figure. The quantities $D_p/\sqrt{2}$ and $D_q/\sqrt{2}$ are the perpendicular projections of
the major and minor semi-axes on one of the co-ordinate axes, and the difference between
these projections is $C_1$. The Gressens and Mouzon coefficient S is, of course, the projection
of the minor semi-axis of the ellipse upon one of the co-ordinate axes. The projection of the
major semi-axis would be an equally valid measure of correlation. By solving the equation
of the unit average deviation ellipse for the intercepts, it may be shown that the distance
from the centre of the ellipse to its intercepts with one of the co-ordinate axes is the quantity
$\sqrt{1 - r^2}$, which has been termed the coefficient of alienation. If this distance be taken as one
leg of a right triangle whose hypotenuse is the radius of the unit deviation circle for zero
correlation, then the quantity r is represented by the other leg, namely, the perpendicular
from the intercept of the ellipse with one of the co-ordinate axes to the circle for zero
correlation.




Most of our knowledge of the normal distribution and of normal correlation involves 
second-moment functions. For this reason extensive use of second-moment functions is made 
here in treating the problems of first-moment correlation. 

The relationship of the several measures of similarity is easily visualized by means of the
ellipse of unit average deviation. As the correlation changes from zero to positive, the ellipse
contracts around the positive diagonal, $D_p$ becomes longer and $D_q$ becomes shorter, $C_1$
increases and S decreases. At the same time the intercepts of the ellipse on the co-ordinate
axes move toward the centre, the distance $\sqrt{1 - r^2}$ becomes shorter, and the perpendicular r
from the intercept to the circle approaches unity. It is not possible on this basis to select
one of the three measures, r, $C_1$ or S, as best representing a given ellipse. It is clear, however,
that all agree in the order in which they would arrange various normal bivariate distributions
as to degree of correlation.

The critical factor in the choice of a measure of similarity is the manner in which the various
coefficients treat distributions which depart from the normal. As emphasized above and as
pointed out previously by Gressens and Mouzon and again by Davies, neither S nor $C_1$ is
affected by highly divergent points to the same degree as is the Pearsonian coefficient r.
Between S and $C_1$ the choice must usually lie with $C_1$, since it depends upon both the minor
and the major axes of the distribution rather than upon one alone. The coefficient S is only
a measure of the agreement present, while $C_1$ takes into account the disagreement between
the series. In distributions which depart from the normal the lag position of maximum
agreement is not always identical with the lag position of minimum disagreement.

Determination of the first-moment correlation coefficient by photoelectric measurement: the
ideal case. The derivation of the first-moment correlation coefficient from the light
transmission of the bar graphs is most easily developed by considering first the perfect case in
which the deviations are measured exactly from the means of the measured spans to yield
the series $a_1$, $a_2$. The imperfect case will be developed by extension of the perfect case.

The values of $a_1$ and $a_2$ are represented by transparent bars on two film strips, either as
deviations extending above and below a mid-line, or as deviations extending inward from
the upper and lower edges of the effective film width, but not reaching past the mid-line.
For determination of the amount of agreement the film strips are placed together, both
upright, and measurement is made of the clear areas common to both strips. The clear area
remaining upon superposition of a pair of transparent bars is equivalent in length to the shorter
bar, designated y. When the signs of $a_1$ and $a_2$ agree, $y = \tfrac12(|a_1 + a_2| - |a_1 - a_2|)$. When the
signs of $a_1$ and $a_2$ disagree, y = 0. This may be expressed
$$y = \tfrac12(|a_1| + |a_2| - |a_1 - a_2|). \tag{10}$$
If one of the film strips is inverted and the strips again placed together, bars of opposite sign
will be aligned, and the clear area remaining will correspond to the amount of disagreement,
designated z. When the signs of $a_1$ and $a_2$ agree, z = 0. When the signs of $a_1$ and $a_2$ disagree,
$z = \tfrac12(|a_1 - a_2| - |a_1 + a_2|)$. This may be expressed
$$z = \tfrac12(|a_1| + |a_2| - |a_1 + a_2|). \tag{11}$$

Equation (1) now becomes s = (y - z). In a span of length n let $\bar y = \sum y/n$ and $\bar z = \sum z/n$.
Then $C_1 = \bar y - \bar z$.
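
Equations (10) and (11) amount to taking the shorter bar when the signs agree (for y) or disagree (for z). A minimal sketch of the two expressions follows; the function name `overlap_lengths` is illustrative only, and the sample values are taken from items 7 and 2 of Table 1.

```python
def overlap_lengths(a1, a2):
    """Return (y, z) for one pair of bars, following equations (10) and (11)."""
    y = 0.5 * (abs(a1) + abs(a2) - abs(a1 - a2))  # clear length, both films upright
    z = 0.5 * (abs(a1) + abs(a2) - abs(a1 + a2))  # clear length, one film inverted
    return y, z

# Signs agree: y is the shorter bar and z vanishes.
y_agree, z_agree = overlap_lengths(0.70, 2.04)
# Signs disagree: z is the shorter bar and y vanishes.
y_disagree, z_disagree = overlap_lengths(0.70, -0.59)
```

Writing both quantities in terms of absolute values avoids any explicit test on the signs, which is what allows them to be read directly as transmitted light.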

In determining $\bar y$ and $\bar z$ from the bar-graph film, account must be taken of the units in
which $a_1$ and $a_2$ are expressed. Let 2k represent the effective film width in average deviation
units. The maximum value of a that may be recorded is then k. In practice k is so chosen that
the probability of encountering a value of a > k is less than 0.001.

The proportion of film remaining clear for a single bar is y/2k or z/2k. For a span of length
n let the transparent proportions be $Y = \bar y/2k$ and $Z = \bar z/2k$. Then
$$C_1 = 2k(Y - Z), \tag{12}$$
and the Gressens and Mouzon coefficient is
$$S = 1 - 2kY. \tag{13}$$
In terms of the quantities $\bar y$ and $\bar z$ we have from equation (3),
$$D_p = \sqrt{2}(1 - \bar z) \quad\text{and}\quad D_q = \sqrt{2}(1 - \bar y).$$


Since $\bar y$ is the measure of agreement, the major semi-axis of the ellipse is a complementary
measure of disagreement, and the minor semi-axis is a complementary measure of
agreement, as illustrated in Fig. 1.

According to the preceding scheme, $C_1$ is obtained from two transmission measurements
on the superimposed films, one made with both films upright and one made with one film
inverted. The coefficient S is obtained from a single measurement made with both films
upright. The inconvenience of inverting one film to determine Z may be avoided by recording
both positive and negative values of a alternately as bars extending from the same side of
the film. Then Z is determined by shifting one film by a single bar width, Y at lag 1 by shifting
it two bar widths, and so forth. This method of recording, however, allows only half as many
items to be assessed at a single setting.

The lag position for maximum $D_p$ and for minimum $D_q$ is found by determining the
positions for minimum Z and for maximum Y, respectively. In a normal bivariate distribution
the lag positions for which $D_p$ is a maximum and for which $D_q$ is a minimum will be identical
due to the relation expressed in equation (5), but they will not necessarily be identical for
abnormal distributions in which equation (5) no longer holds.


Table 1. A short numerical example

Item | $X_1$ | $X_2$ | $x_1$ | $x_2$ | $a_1$ | $a_2$ | y | z
1 | 4 | 2 | -1.6 | -1.9 | -0.80 | -1.25 | 0.80 | —
2 | 7 | 3 | 1.4 | -0.9 | 0.70 | -0.59 | — | 0.59
3 | 6 | 4 | 0.4 | 0.1 | 0.20 | 0.07 | 0.07 | —
4 | 10 | 5 | 4.4 | 1.1 | 2.20 | 0.72 | 0.72 | —
5 | 5 | 6 | -0.6 | 2.1 | -0.30 | 1.38 | — | 0.30
6 | 4 | 2 | -1.6 | -1.9 | -0.80 | -1.25 | 0.80 | —
7 | 7 | 7 | 1.4 | 3.1 | 0.70 | 2.04 | 0.70 | —
8 | 2 | 1 | -3.6 | -2.9 | -1.80 | -1.91 | 1.80 | —
9 | 8 | 5 | 2.4 | 1.1 | 1.20 | 0.72 | 0.72 | —
10 | 3 | 4 | -2.6 | 0.1 | -1.30 | 0.07 | — | 0.07
Σ | 56 | 39 | 0 | 0 | 0 | 0 | 5.61 | 0.96
Mean | 5.6 | 3.9 | — | — | — | — | 0.561 | 0.096
Σ mod | — | — | 20.0 | 15.2 | 10.00 | 10.00 | — | —
Mean mod | — | — | 2.0 | 1.52 | 1.00 | 1.00 | — | —

n = 10, $C_1 = \bar y - \bar z = 0.465$, $S = 1 - \bar y = 0.439$.







A short numerical example is presented in Table 1, and the same data are represented
as bar graphs (Fig. 2). In this figure k is chosen as 2.5, although in practice it would be made
somewhat greater than 4 so that the frequency of values of a > k will be very small.
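
The arithmetic of Table 1 can be retraced as follows. This is a sketch only, assuming the $X_1$ and $X_2$ columns as tabulated; the helper name is illustrative. Because the table rounds $a_2$ to two decimals before forming y and z, the unrounded results agree with the tabulated $C_1$ and S only to about the third decimal place.

```python
def to_average_deviation_units(xs):
    """Deviations from the mean, divided by the mean absolute deviation."""
    mean = sum(xs) / len(xs)
    deviations = [x - mean for x in xs]
    average_deviation = sum(abs(d) for d in deviations) / len(deviations)
    return [d / average_deviation for d in deviations]

X1 = [4, 7, 6, 10, 5, 4, 7, 2, 8, 3]   # column X_1 of Table 1
X2 = [2, 3, 4, 5, 6, 2, 7, 1, 5, 4]    # column X_2 of Table 1

a1 = to_average_deviation_units(X1)
a2 = to_average_deviation_units(X2)
n = len(a1)

# Equations (10) and (11), averaged over the span.
y_bar = sum(0.5 * (abs(u) + abs(v) - abs(u - v)) for u, v in zip(a1, a2)) / n
z_bar = sum(0.5 * (abs(u) + abs(v) - abs(u + v)) for u, v in zip(a1, a2)) / n

c1 = y_bar - z_bar   # first-moment coefficient
s = 1.0 - y_bar      # Gressens and Mouzon coefficient
```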





Fig. 2. A short example in bar graph form. (Panel y: $a_1$ and $a_2$ superimposed. Panel z: $a_2$ inverted and superimposed on $a_1$.)


Correction for errors of imperfect series 

Origin and definition of the error. As mentioned in the introduction, the matching of
graphed time series generally involves an error of imperfect series. This error results partly
from the fact that an assessed span is not the entire span for which the mean and average
deviation were determined, and consequently in the assessed span $\sum a \neq 0$ and $\sum |a|/n \neq 1$.
It is also apparent that, in general, when running means and running average deviations are
used, there is an error of imperfect series even when no terms are dropped, unless the series
is re-expressed as deviations from the new mean in units of the new average deviation.

In the following discussion primes indicate terms and parameters of imperfect series,
e.g. values determined for an actual or hypothetical perfect series which is longer than the
assessed span, and parameters derived from these values over the span assessed. Parameters
derived from the original perfect series over its entire length bear also the subscript 0.
Unprimed symbols designate the true values for the assessed span. For an assessed span let
$a' = (X - \bar X_0)/D_{x_0}$. The following quantities may be measured photoelectrically:

$Y' = \bar y'/2k$, the proportion of the given span which is mutually transparent for two films
in the Y position, and $Z' = \bar z'/2k$, the proportion of the given span which is mutually
transparent for two films in the Z position.

Equation (10) now becomes
$$y' = \tfrac12(|a_1'| + |a_2'| - |a_1' - a_2'|),$$
whence
$$Y' = (D_{a_1'} + D_{a_2'})/4k - \sum |a_1' - a_2'|/4nk,$$
and equation (11) becomes
$$z' = \tfrac12(|a_1'| + |a_2'| - |a_1' + a_2'|),$$
whence
$$Z' = (D_{a_1'} + D_{a_2'})/4k - \sum |a_1' + a_2'|/4nk.$$











The expression for $C_1'$ in the imperfect case, corresponding to $C_1$ in the perfect case as given
in equations (1) and (2), is defined
$$C_1' = \sum (|a_1' + a_2'| - |a_1' - a_2'|)/2n = \bar y' - \bar z', \tag{14}$$
or in terms of photoelectric measurements,
$$C_1' = 2k(Y' - Z'). \tag{15}$$


Analysis of errors of imperfect series. In order to investigate the effect on $C_1$ of errors of
imperfect series and to devise corrections for them, it is necessary to separate the error,
defining g as the factor error in the average deviation and h as the additive error in the mean.
The error h is expressed in units of the average deviation of the span. Thus
$$a' = g(a + h), \tag{16}$$
where $h = (\bar X - \bar X_0)/D_x$ and $g = D_x/D_{x_0}$. The problem at hand is to determine whether $C_1'$
may be corrected to $C_1$, and under what conditions the error in $C_1'$ will be so small as to be
negligible. The errors g and h must first be found. This may be done by direct computation
from the original data for any given span, but a more rapid method is to determine the
errors g and h from $\bar a'$ and $D_{a'}$, found either photoelectrically or by computation. This latter
method, outlined below, is only approximate, since it involves the assumption that the
deviations a are distributed normally for the span.
Summing the moduli of equation (16) and dividing by n,
$$g = D_{a'}/D_{a+h}; \tag{17}$$
summing equation (16) algebraically and dividing by n,
$$g = \bar a'/h; \tag{18}$$
and eliminating g between the last two equations,
$$h/D_{a+h} = \bar a'/D_{a'}. \tag{19}$$


In the following section a derivation of $D_{a+h} = f(h)$ is presented. With the assumption of
a normal distribution of a, a table of h with argument $\bar a'/D_{a'}$ may be computed. The probable
value of g may then be determined from equation (18).

Some saving of time may be effected by determining $\bar a'$ and $D_{a'}$ photoelectrically from the
film graph. Let L' be the relative transmission of the negative portion of the span of a single
film, i.e. the light transmission measured with a mask covering the positive portion of the
film strip, and let U' be the transmission of the positive portion of the span. Then
$$L' = (\sum |a'| - \sum a')/4nk \quad\text{and}\quad U' = (\sum |a'| + \sum a')/4nk,$$
whence
$$\bar a' = 2k(U' - L'), \qquad D_{a'} = 2k(U' + L'),$$
and
$$\bar a'/D_{a'} = (U' - L')/(U' + L'). \tag{20}$$
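
The relation between the two masked transmissions and the argument of equation (20) may be illustrated as follows. This is a sketch under stated assumptions: the function names and the four sample deviations are hypothetical, and the transmissions are simulated from the deviations rather than measured.

```python
def masked_transmissions(a_values, k):
    """Simulate U' and L' for bars a' on a film of effective width 2k."""
    n = len(a_values)
    positive_total = sum(v for v in a_values if v > 0)    # (sum|a'| + sum a') / 2
    negative_total = sum(-v for v in a_values if v < 0)   # (sum|a'| - sum a') / 2
    return positive_total / (2 * k * n), negative_total / (2 * k * n)

def mean_and_average_deviation(u_prime, l_prime, k):
    """Recover the mean of a', D_a' and their ratio from U' and L' (equation (20))."""
    a_bar = 2 * k * (u_prime - l_prime)
    d_a = 2 * k * (u_prime + l_prime)
    return a_bar, d_a, a_bar / d_a

sample = [0.5, -1.2, 0.9, 0.2]   # hypothetical a' values
u_prime, l_prime = masked_transmissions(sample, k=2.5)
a_bar, d_a, ratio = mean_and_average_deviation(u_prime, l_prime, k=2.5)
```

The ratio does not depend on k, so the argument needed for Table 2 can be formed from the two readings alone.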


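The average deviation of a normal distribution about a point other than the mean. If t and w
represent respectively the deviation from the mean and the error in the mean in standard
units corresponding to a and h in average deviation units, $a = \sqrt{\tfrac12\pi}\,t$ and $h = \sqrt{\tfrac12\pi}\,w$, then
the quantity $D_{a+h}$ of equation (19) may be written
$$D_{a+h} = D_{\sqrt{\pi/2}\,(t+w)} = (1/n)\sqrt{\tfrac12\pi}\,\sum |t + w|.$$
Assuming a normal distribution of t,
$$D_{a+h} = \sqrt{\tfrac12\pi}\left[\int_{-w}^{\infty}(t + w)\,\phi(t)\,dt - \int_{-\infty}^{-w}(t + w)\,\phi(t)\,dt\right]$$
$$= \sqrt{\tfrac12\pi}\left[\int_{-w}^{\infty}t\,\phi(t)\,dt + w\int_{-w}^{\infty}\phi(t)\,dt - \int_{-\infty}^{-w}t\,\phi(t)\,dt - w\int_{-\infty}^{-w}\phi(t)\,dt\right],$$
where $\phi(t) = (1/\sqrt{2\pi})\,e^{-t^2/2}$, the ordinate of the normal curve at t. But
$$\int_{-w}^{\infty}t\,\phi(t)\,dt = -\int_{-\infty}^{-w}t\,\phi(t)\,dt = \phi(w),$$
and
$$\int_{-w}^{\infty}\phi(t)\,dt = \Phi(w) \quad\text{and}\quad \int_{-\infty}^{-w}\phi(t)\,dt = \Phi(-w) = 1 - \Phi(w),$$
where $\Phi(w)$ is the area under the normal curve from $-\infty$ to w. Hence
$$D_{a+h} = \sqrt{\tfrac12\pi}\,[2\phi(w) + 2w\Phi(w) - w]. \tag{21}$$
Substituting $w = \sqrt{2/\pi}\,h$,
$$D_{a+h} = \sqrt{2\pi}\,\phi\{\sqrt{2/\pi}\,h\} + 2h\,\Phi\{\sqrt{2/\pi}\,h\} - h. \tag{22}$$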


Table 2 gives $h = f(\bar a'/D_{a'})$. The sign of h agrees with the sign of $\bar a'$.

Table 2. Values of h

$\bar a'/D_{a'}$ | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
0.00 | 0.0000 | 0.0100 | 0.0200 | 0.0300 | 0.0400 | 0.0500 | 0.0601 | 0.0701 | 0.0802 | 0.0902
0.10 | 0.1003 | 0.1104 | 0.1206 | 0.1307 | 0.1409 | 0.1511 | 0.1613 | 0.1716 | 0.1819 | 0.1922
0.20 | 0.2026 | 0.2130 | 0.2235 | 0.2340 | 0.2446 | 0.2552 | 0.2659 | 0.2766 | 0.2874 | 0.2983
0.30 | 0.3091 | 0.3201 | 0.3311 | 0.3422 | 0.3534 | 0.3647 | 0.3761 | 0.3876 | 0.3992 | 0.4109
0.40 | 0.4225 | 0.4344 | 0.4463 | 0.4585 | 0.4707 | 0.4830 | 0.4955 | 0.5082 | 0.5210 | 0.5340
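
Entries of Table 2 can be recomputed by solving equation (19) for h with $D_{a+h}$ taken from equation (22). The sketch below uses simple bisection, which is a choice made here for illustration and not necessarily how the table was prepared; the function names are hypothetical, and the argument is assumed to be less than 1 in absolute value.

```python
import math

def normal_ordinate(t):
    """phi(t), the ordinate of the unit normal curve."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_area(t):
    """Phi(t), the area under the unit normal curve from -infinity to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def shifted_average_deviation(h):
    """D_{a+h} of equation (22), for h in average deviation units."""
    w = math.sqrt(2.0 / math.pi) * h
    return math.sqrt(2.0 * math.pi) * normal_ordinate(w) + 2.0 * h * normal_area(w) - h

def h_from_ratio(ratio, tol=1e-10):
    """Solve h / D_{a+h} = ratio (equation (19)); the sign of h follows the ratio."""
    target = abs(ratio)
    lo, hi = 0.0, 10.0                     # h / D_{a+h} increases with h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / shifted_average_deviation(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.copysign(0.5 * (lo + hi), ratio)

h_010 = h_from_ratio(0.10)
h_045 = h_from_ratio(0.45)
```

With the argument 0.10 the routine returns a value that rounds to the tabulated 0.1003, and a negative argument returns the same magnitude with a negative sign.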









































Correction of g and h. From equation (16)
$$a_1' = g_1a_1 + g_1h_1 \quad\text{and}\quad a_2' = g_2a_2 + g_2h_2.$$
We define the point $(a_1', a_2')$ as point (p', q') in reference to the p, q axes, so that
$$p' = (a_1' + a_2')/\sqrt{2} \quad\text{and}\quad q' = (a_2' - a_1')/\sqrt{2}.$$
Equation (14) may then be written in a form analogous to equation (4):
$$C_1' = (D_{p'} - D_{q'})/\sqrt{2}. \tag{23}$$
The accompanying drawing (Fig. 3) illustrates p' and q' and the related quantities defined
below. The quantities p' and q' are positive or negative according to their position in reference
to the p, q axes. The mean of the $(a_1', a_2')$ distribution is at $(g_1h_1, g_2h_2)$, or in terms of the p, q
axes it is at the point $(\bar p', \bar q')$. The deviations of points (p', q') from the mean $(\bar p', \bar q')$ of the
distribution in reference to the p, q axes are
$$p'' = (g_1a_1 + g_2a_2)/\sqrt{2}, \qquad q'' = (g_2a_2 - g_1a_1)/\sqrt{2}.$$
The quantities p'' and q'' are positive or negative according to their position in relation to
$\bar p'$ and $\bar q'$ in reference to the p, q axes, whence
$$p' = p'' + \bar p', \qquad q' = q'' + \bar q'.$$











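The average deviations of p' and q' are
$$D_{p'} = \sum |p'' + \bar p'|/n, \qquad D_{q'} = \sum |q'' + \bar q'|/n.$$
Let $d'' = p''/\sigma_{p''}$ and $\bar d' = \bar p'/\sigma_{p''}$, where $\sigma_{p''} = \sqrt{\tfrac12\pi}\,D_{p''}$. Then
$$D_{p'} = D_{p''+\bar p'} = \sqrt{\tfrac12\pi}\,D_{p''}\,\sum |d'' + \bar d'|/n.$$
This is an expression of the type of equation (21). Therefore
$$D_{p'} = \sqrt{\tfrac12\pi}\,D_{p''}\,[2\phi(\bar d') + 2\bar d'\,\Phi(\bar d') - \bar d'],$$
and a similar expression may be written for $D_{q'}$. Substituting $\bar d' = \sqrt{2/\pi}\,(\bar p'/D_{p''})$, and
treating the corresponding expression for $D_{q'}$ similarly,
$$D_{p'} = \sqrt{2\pi}\,D_{p''}\,\phi\{\sqrt{2/\pi}\,(\bar p'/D_{p''})\} + 2\bar p'\,\Phi\{\sqrt{2/\pi}\,(\bar p'/D_{p''})\} - \bar p',$$
$$D_{q'} = \sqrt{2\pi}\,D_{q''}\,\phi\{\sqrt{2/\pi}\,(\bar q'/D_{q''})\} + 2\bar q'\,\Phi\{\sqrt{2/\pi}\,(\bar q'/D_{q''})\} - \bar q'. \tag{24}$$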


Fig. 3. Elements of the imperfect distribution.


Observe that the above equations are unaffected by the signs of $\bar p'$ or $\bar q'$, for
$\phi(-x) = \phi(x)$ and $\Phi(-x) = 1 - \Phi(x)$.
Equations (23) and (24) may be combined to give, in abbreviated form,
$$C_1' = \{f(D_{p''}, \bar p') - f(D_{q''}, \bar q')\}/\sqrt{2}. \tag{25}$$
Illustration of the relationship of the unit average deviation ellipse for the imperfect series
to the unit average deviation ellipse for the corresponding perfect series is simplified by
translation of the axes so as to make the centres coincide. A unit average deviation ellipse
for a normal bivariate distribution with a moderately high positive correlation is illustrated
(Fig. 4). The unit average deviation ellipse for the same correlation but with the variates
multiplied by the factors $g_1$ and $g_2$ is shown by a broken line. $D_p$ and $D_q$ represent the average
deviations of the perfect series with reference to the p, q axes, and are the principal
semi-axes of the corresponding ellipse. The principal semi-axes of the centred imperfect series
ellipse are designated $A_1$ and $A_2$. The average deviations of this distribution about the p, q
axes are $D_{p''}$ and $D_{q''}$. The equation of the imperfect series unit average deviation ellipse is


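$$a_1^2/g_1^2 - 2r\,a_1a_2/(g_1g_2) + a_2^2/g_2^2 = 1 - r^2,$$
in which $a_1$ and $a_2$ here denote the co-ordinates of the centred imperfect distribution.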
It may be shown that the equation between the sums of the variances of a bivariate
distribution taken in reference to the principal axes and to the co-ordinate axes, cited in the
derivation of equation (5), is true for any pair of rectangular co-ordinates with coincident
origins. Hence
$$D_{p''}^2 + D_{q''}^2 = D_{g_1a_1}^2 + D_{g_2a_2}^2 = g_1^2 + g_2^2. \tag{26}$$
Equations (25) and (26) are to be solved simultaneously for $D_{p''}^2$ and $D_{q''}^2$.
Fig. 4. Ellipse of unit average deviation for the centred imperfect distribution.


A graphical solution for $D_{p''}^2$ and $D_{q''}^2$. In the graph (Fig. 5) $D_{p'}$ or $D_{q'}$ is
represented as a continuous function of $D_{p''}^2$ or $D_{q''}^2$ at selected values of $\bar p'$ or $\bar q'$ respectively.
A simultaneous solution of equations (25) and (26) results when a pair of points is found on
the proper curves for $\bar p'$ and $\bar q'$ such that the sum of their abscissae is $(g_1^2 + g_2^2)$, and the difference
between their ordinates is $C_1'\sqrt{2}$. Solution may be facilitated by employing an auxiliary
grid on transparent paper, with a vertical scale $\sqrt{2}$ times the ordinate of the graph (Fig. 5)
and a horizontal scaled from both sides of a centre zero line in any convenient units, as
illustrated in Fig. 6. The horizontal line at a distance $C_1'$ above the zero horizontal is noted. This
auxiliary grid is laid on the graph (Fig. 5) so that its zero vertical axis coincides with the line
$D^2 = \tfrac12(g_1^2 + g_2^2)$, and is moved up this line until the points on the selected $\bar p'$ and $\bar q'$ curves
lying under the $C_1'$ line and the zero horizontal are equidistant from the zero vertical axis.
Attention must be paid to the algebraic sign of $(D_{p'} - D_{q'})$ which must agree with the sign of
$C_1'$. Since the curves for only certain values of $\bar p'$ and $\bar q'$ are provided, it may be necessary to
pencil in lightly the approximate curves for $\bar p'$ and $\bar q'$ in the regions required.

The simultaneous solution of equations (25) and (26) in effect adjusts $C_1'$ for $h_1$ and $h_2$;
geometrically it translates the axes of the imperfect distribution to (0, 0). $D_{p''}^2$ and $D_{q''}^2$ must
next be adjusted for $g_1$ and $g_2$.
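
The same pair of equations can also be solved numerically instead of graphically. The following is a minimal sketch, assuming bisection on the difference of the two squared deviations (a substitute for the auxiliary grid, not the procedure described above); all function names are hypothetical. It is applied to the case quoted in the caption of Fig. 6.

```python
import math

def normal_ordinate(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_area(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def f(d, m):
    """f(D, m) of equations (24) and (25): average deviation about a point offset by m."""
    x = math.sqrt(2.0 / math.pi) * m / d
    return math.sqrt(2.0 * math.pi) * d * normal_ordinate(x) + 2.0 * m * normal_area(x) - m

def solve_principal_deviations(c1_prime, p_bar, q_bar, sum_g_squared, tol=1e-12):
    """Return (D_p''^2, D_q''^2) satisfying equations (25) and (26)."""
    half = 0.5 * sum_g_squared

    def residual(delta):
        d_p = math.sqrt(half + delta)
        d_q = math.sqrt(half - delta)
        return (f(d_p, p_bar) - f(d_q, q_bar)) / math.sqrt(2.0) - c1_prime

    lo, hi = -half + 1e-9, half - 1e-9      # residual increases with delta
    if residual(lo) > 0.0 or residual(hi) < 0.0:
        raise ValueError("C_1' lies outside the range attainable for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    return half + delta, half - delta

# Case of Fig. 6: C_1' = 0.21, mean p' = 0.5, mean q' = 0.1, g_1^2 + g_2^2 = 2.500.
dp2, dq2 = solve_principal_deviations(0.21, 0.5, 0.1, 2.5)
```

Under these assumptions the result agrees with the graphical solution quoted for Fig. 6 to within the precision with which the grid can be read.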











Fig. 5. (Abscissa: $D_{p''}^2$, $D_{q''}^2$, scaled from 0 to 2.5.)

Fig. 6. Auxiliary grid, shown as used for $C_1' = 0.21$, $\bar p' = 0.5$, $\bar q' = 0.1$, $\tfrac12(g_1^2 + g_2^2) = 1.250$.
The solution is $D_{p''}^2 = 1.513$, $D_{q''}^2 = 0.987$.























$\tilde C_1$ as a function of $D_{p''}$, $D_{q''}$, $g_1$ and $g_2$. As defined above,
$$p'' = (g_1a_1 + g_2a_2)/\sqrt{2} \quad\text{and}\quad q'' = (g_2a_2 - g_1a_1)/\sqrt{2}.$$
Squaring, summing and dividing by n,
$$\sigma_{p''}^2 = \tfrac12(g_1^2\sigma_{a_1}^2 + g_2^2\sigma_{a_2}^2 + \pi g_1g_2r), \qquad \sigma_{q''}^2 = \tfrac12(g_1^2\sigma_{a_1}^2 + g_2^2\sigma_{a_2}^2 - \pi g_1g_2r),$$
since $a = (x/\sigma)\sqrt{\tfrac12\pi}$ and $r = (\sum a_1a_2/n)(2/\pi)$.
Substituting $D\sqrt{\tfrac12\pi}$ for $\sigma$, and noting that $D_{a_1} = D_{a_2} = 1$, we have
$$D_{p''}^2 = \tfrac12(g_1^2 + g_2^2 + 2g_1g_2r) \quad\text{and}\quad D_{q''}^2 = \tfrac12(g_1^2 + g_2^2 - 2g_1g_2r).$$
Solving for $\tilde r$, where the wavy line represents an estimated value based on the assumption of
normal distribution,
$$\tilde r = \frac{D_{p''}^2 - D_{q''}^2}{2g_1g_2}, \tag{27}$$
and from equation (7)
$$(1 - \tilde C_1^2)^2 = 1 - \tilde r^2. \tag{28}$$

Although $\tilde r$ appears as an intermediate step in this derivation, it is to be expected that in
distributions departing from the normal $\tilde C_1$ is a better estimate of $C_1$ than $\tilde r$ is of r, since both
estimates are derived from the first-moment coefficient $C_1'$. With the aid of a table of $\sqrt{1 - r^2}$,
such as that provided by Pearson (1930) in which the argument is given to three places,
$\tilde C_1$ may be computed readily from $\tilde r$. Or direct conversion may be made by Table 3.


Table 3. $C_1 = \sqrt{1 - \sqrt{1 - r^2}}$. Values of $C_1$

r | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09
0.00 | 0.0000 | 0.0071 | 0.0141 | 0.0212 | 0.0283 | 0.0354 | 0.0424 | 0.0495 | 0.0566 | 0.0637
0.10 | 0.0708 | 0.0779 | 0.0850 | 0.0921 | 0.0992 | 0.1064 | 0.1135 | 0.1206 | 0.1278 | 0.1350
0.20 | 0.1421 | 0.1493 | 0.1565 | 0.1637 | 0.1710 | 0.1782 | 0.1854 | 0.1927 | 0.2000 | 0.2073
0.30 | 0.2146 | 0.2220 | 0.2293 | 0.2367 | 0.2441 | 0.2515 | 0.2589 | 0.2664 | 0.2739 | 0.2814
0.40 | 0.2889 | 0.2965 | 0.3041 | 0.3117 | 0.3194 | 0.3271 | 0.3348 | 0.3425 | 0.3503 | 0.3582
0.50 | 0.3660 | 0.3739 | 0.3819 | 0.3899 | 0.3979 | 0.4060 | 0.4141 | 0.4223 | 0.4306 | 0.4389
0.60 | 0.4472 | 0.4556 | 0.4641 | 0.4727 | 0.4813 | 0.4900 | 0.4987 | 0.5076 | 0.5165 | 0.5255
0.70 | 0.5347 | 0.5439 | 0.5532 | 0.5626 | 0.5722 | 0.5819 | 0.5917 | 0.6016 | 0.6117 | 0.6220
0.80 | 0.6325 | 0.6431 | 0.6539 | 0.6650 | 0.6763 | 0.6879 | 0.6998 | 0.7120 | 0.7246 | 0.7376
0.90 | 0.7511 | 0.7651 | 0.7798 | 0.7953 | 0.8117 | 0.8293 | 0.8485 | 0.8700 | 0.8950 | 0.9268
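
Equations (27) and (28) and the conversion of Table 3 are simple enough to evaluate directly. The sketch below is illustrative only; the function names are hypothetical, and the estimate of r is assumed to lie between -1 and +1.

```python
import math

def c1_from_r(r):
    """Table 3 conversion, C_1 = sqrt(1 - sqrt(1 - r^2)), carrying the sign of r."""
    return math.copysign(math.sqrt(1.0 - math.sqrt(1.0 - r * r)), r)

def corrected_coefficient(dp2, dq2, g1, g2):
    """Equations (27) and (28): estimates of r and C_1 from D_p''^2, D_q''^2, g_1, g_2."""
    r_tilde = (dp2 - dq2) / (2.0 * g1 * g2)
    return r_tilde, c1_from_r(r_tilde)

c_050 = c1_from_r(0.50)   # compare with the entry for r = 0.50 in Table 3
c_098 = c1_from_r(0.98)   # compare with the entry for r = 0.98 in Table 3

# With g_1 = g_2 = 1 and equation (26) satisfied, r = 0.5 is recovered exactly.
r_check, c_check = corrected_coefficient(1.5, 0.5, 1.0, 1.0)
```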









































Approximations 
An approximate value of g. When $\bar a' = 0$, h = 0 and equation (18) is indeterminate, but in
such a case $D_{a+h} = 1$ and $g = D_{a'}$ by equation (17). The error introduced by taking
$g = D_{a'}$ when $\bar a' \neq 0$ will not be greater than 0.005 for the ranges of $\bar a'$ corresponding to the
selected values of $D_{a'}$ shown in Table 4, and this error in g is always positive.

Table 4. Ranges of $\bar a'$ for $(D_{a'} - g) \leq 0.005$

$D_{a'}$ | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | 1.00 | 1.10 | 1.20 | 1.30 | 1.40 | 1.50
$\bar a'$ | [illegible] | ±0.096 | ±0.104 | ±0.111 | ±0.118 | ±0.124 | ±0.131 | ±0.137 | ±0.143 | ±0.150 | ±0.156

An error of 0.005 in g results in an approximate error of $-0.005(\bar a'/g^2)$ in h. This error in
h almost always lies within ±0.005 since only very rarely is $|\bar a'| > 0.25$ or $D_{a'} < 0.50$.















































Correction of $C_1'$ not always necessary. When the $\bar a'$ and g parameters fall within certain
ranges, the graphical correction of $C_1'$ to $\tilde C_1$ may be eliminated. Errors in the
determination of $C_1'$ by means of the present photoelectric device preclude corrections in $C_1'$ of less
than about 0.015. In general, errors of this size are immaterial to a location of the match
position between dendrochronological series. Table 5 gives the maximum values of $C_1'$ for
$|\tilde C_1 - C_1'| < 0.015$ for selected combinations of $g_1$, $g_2$ and $\bar p'$, $\bar q'$ or $\bar a_1'$, $\bar a_2'$, and will serve in many
cases as a sufficient guide in deciding whether the correction may be ignored. Table 5 is
constructed for positive values of $C_1'$. If $C_1'$ is negative, the maximum negative value of $C_1'$
for which the error falls within the 0.015 limit may be found by reversing the signs of $\bar p'$
and $\bar q'$. This table may be extended in a future publication.

Table 5. Maximum $C_1'$ for which $|\tilde C_1 - C_1'| < 0.015$

$\bar p'$ | | 0 | ±0.10 | 0 | ±0.10, ±0.10
$\bar q'$ | | 0 | 0 | ±0.10 | ∓0.10, ±0.10
$\bar a_1'$ | | 0 | ±0.07 | ∓0.07 | ±0.14, 0
$\bar a_2'$ | | 0 | ±0.07 | ±0.07 | 0, ±0.14
$g_1$ or $g_2$ | $g_2$ or $g_1$ | | | |
0.900 | 0.900 | 0.13 | 0.15 | 0.10 | 0.13
0.900 | 0.950 | 0.18 | 0.19 | 0.13 | 0.16
0.900 | 1.000 | 0.26 | 0.29 | 0.21 | 0.24
0.900 | 1.050 | 0.41 | 0.46 | 0.32 | 0.39
0.900 | 1.100 | 0.65 | 0.68 | 0.53 | 0.58
0.950 | 0.950 | 0.28 | 0.32 | 0.22 | 0.26
0.950 | 1.000 | 0.55 | 0.62 | 0.41 | 0.48
0.950 | 1.050 | 0.90 | 0.91 | 0.78 | 0.80
0.950 | 1.100 | 0.93 | 0.93 | 0.88 | 0.88
0.975 | 0.975 | 0.58 | 0.65 | 0.43 | 0.49
0.975 | 1.000 | 0.95 | 0.96 | 0.68 | 0.73
0.975 | 1.025 | 0.97 | 0.97 | 0.85 | 0.87
1.000 | 1.000 | * | * | 0.87 | 0.88
1.000 | 1.025 | * | * | 0.92 | 0.92
1.000 | 1.050 | 0.64 | 0.56 | 0.93 | 0.94
1.000 | 1.100 | 0.34 | 0.30 | 0.41 | 0.36
1.025 | 1.025 | 0.61 | 0.54 | 0.94 | 0.94
1.050 | 1.050 | 0.31 | 0.27 | 0.37 | 0.33
1.050 | 1.100 | 0.22 | 0.19 | 0.25 | 0.22
1.100 | 1.100 | 0.16 | 0.14 | 0.18 | 0.16

* At these points the error does not reach ±0.015 for any value of $C_1'$ up to the maximum $C_1'$
obtainable in a normal distribution. There is, of course, no correction when $g_1 = g_2 = 1.000$ and $\bar p' = \bar q' = 0$.

The proportion of the total range of $C_1'$ for which $|\tilde C_1 - C_1'| < 0.015$ is not immediately
apparent, for the maximum value of $C_1'$ in a normal distribution is in general not 1.00. The
maximum value of $C_1'$ in a normal distribution, corresponding to $C_1 = 1.00$, occurs when
$D_{p''}^2 - D_{q''}^2 = 2g_1g_2$. From equation (26) $D_{p''}^2 + D_{q''}^2 = g_1^2 + g_2^2$, so that $D_{p''}^2 = \tfrac12(g_1 + g_2)^2$ and
$D_{q''}^2 = \tfrac12(g_2 - g_1)^2$. For example, when $g_1 = 0.90$, $g_2 = 0.90$, $\bar p' = 0$ and $\bar q' = 0.10$, then
$C_{1\,\max}' = 0.83$.
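
The closing example can be retraced numerically. The sketch below assumes the limiting form $f(0, m) = |m|$ for a distribution with no spread about the offset point, which follows from equation (24) as $D \to 0$; the function names are hypothetical.

```python
import math

def f(d, m):
    """f(D, m) of equation (24); as D tends to 0 it reduces to |m|."""
    if d == 0.0:
        return abs(m)
    x = math.sqrt(2.0 / math.pi) * m / d
    ordinate = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    area = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return math.sqrt(2.0 * math.pi) * d * ordinate + 2.0 * m * area - m

def max_imperfect_coefficient(g1, g2, p_bar, q_bar):
    """Largest C_1' attainable in a normal distribution (perfect correlation, C_1 = 1)."""
    d_p = (g1 + g2) / math.sqrt(2.0)      # from D_p''^2 = (g1 + g2)^2 / 2
    d_q = abs(g2 - g1) / math.sqrt(2.0)   # from D_q''^2 = (g2 - g1)^2 / 2
    return (f(d_p, p_bar) - f(d_q, q_bar)) / math.sqrt(2.0)

c1_max = max_imperfect_coefficient(0.90, 0.90, 0.0, 0.10)
```

Under these assumptions the value obtained rounds to the 0.83 quoted above.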











































3. APPLICATION 
An example from dendrochronology 


Table 6 and Fig. 7 illustrate the method and results obtained with sample dendrochrono- 
logical series. The specimens here compared are cedars, Juniperus virginiana, collected in 
Tennessee. The year corresponding to each ring is known. Ring widths were measured and 
prepared for correlation assessment by deriving the deviations from centred moving averages 
and dividing each deviation by the centred moving average deviation. The span assessed is 
100 years in length. Series 1 was held constant in span, while series 2 was shifted by an 
amount A. The correlation between the two series is low, but the coefficient reaches a maxi- 
mum at lag zero as expected. 


Table 6. First-moment correlation data and corrections for two specimens of
Juniperus virginiana. n = 100, $\bar a_1' = -0.0421$, $g_1 = 1.0017$

(Columns $\bar y'$ to $D_{a_2'}$ are imperfect values; columns $g_2$ and $\tilde C_1$ are corrections.)

Lag | $\bar y'$ | $\bar z'$ | $C_1'$ | $\bar a_2'$ | $D_{a_2'}$ | $g_2$ | $\tilde C_1$
12 | 0.2790 | 0.2738 | 0.0052 | -0.1064 | 0.9326 | — | 0.0032
11 | 0.3063 | 0.2781 | 0.0282 | -0.1124 | 0.9386 | — | 0.0270
10 | 0.1982 | 0.3478 | -0.1496 | -0.1161 | 0.9423 | — | -0.1571
9 | 0.1925 | 0.3431 | -0.1506 | -0.1302 | 0.9468 | 0.9408 | -0.1586
8 | 0.2645 | 0.2828 | -0.0183 | -0.1262 | 0.9428 | 0.9369 | -0.0215
7 | 0.3082 | 0.2871 | 0.0211 | -0.1148 | 0.9314 | — | 0.0196
6 | 0.2712 | 0.2778 | -0.0066 | -0.1317 | 0.9431 | 0.9374 | -0.0095
5 | 0.2934 | 0.2752 | 0.0182 | -0.1357 | 0.9471 | 0.9404 | 0.0162
4 | 0.3228 | 0.2549 | 0.0679 | -0.1381 | 0.9495 | 0.9433 | 0.0675
3 | 0.2915 | 0.2382 | 0.0533 | -0.1068 | 0.9808 | — | —
2 | 0.2727 | 0.2604 | 0.0123 | -0.0586 | 1.0234 | — | —
1 | 0.3619 | 0.1911 | 0.1708 | -0.0543 | 1.0191 | — | —
0 | 0.3752 | 0.1765 | 0.1987 | -0.0423 | 1.0253 | — | 0.1955*
-1 | 0.3454 | 0.2333 | 0.1121 | -0.0287 | 1.0117 | — | —
-2 | 0.3459 | 0.2575 | 0.0884 | -0.0398 | 1.0170 | — | —
-3 | 0.3241 | 0.2462 | 0.0779 | -0.0276 | 1.0110 | — | —
-4 | 0.3199 | 0.2364 | 0.0835 | -0.0104 | 1.0040 | — | —
-5 | 0.3045 | 0.2943 | 0.0102 | -0.0239 | 0.9905 | — | —
-6 | 0.3300 | 0.2797 | 0.0503 | -0.0198 | 0.9946 | — | —
-7 | 0.3180 | 0.2651 | 0.0529 | 0.0004 | 0.9890 | — | —
-8 | 0.2690 | 0.3035 | -0.0345 | -0.0162 | 0.9920 | — | —
-9 | 0.2811 | 0.2763 | 0.0048 | 0.0491 | 0.9861 | — | —
-10 | 0.2053 | 0.3466 | -0.1413 | 0.0482 | 0.9870 | — | —
-11 | 0.1738 | 0.3422 | -0.1684 | 0.0449 | 0.9903 | — | —
-12 | 0.2546 | 0.3147 | -0.0601 | 0.0369 | 0.9983 | — | —































The values of the above table have been computed to four decimal places to illustrate the corrections. 
The author’s photoelectric assessor yields measurements to only two decimal places. 

* Correction is not required at this point, but the corrected value is given for comparison with the 
corresponding coefficient computed from perfect series. 


If a maximum error of + 0-015 is allowed, the parameters Cj, 9,, Ja, p’ and q are such that 
no correction is necessary at lags — 12 to 2. At lags 4, 5, 6, 8 and 9, (D,,—g) > 0-005, and at 
these points h was determined from Table 2 and g from equation (18). The value of 7’ falls 

Biometrika 37 20 











306 A rapid method for ascertaining serial lag correlation 


slightly beyond the limits of Table 5 for lags 3 to 12, although from the nearby point of 
9; = 1-000, g, = 0-950, p’ = 0-10, 7’ = 0-10, at which the maximum value of Cj for which the 
error falls within the limits is 0-48, it appears that the correction may be ignored here as well. 
Clearly no correction would be necessary if a mere determination of the match position were
required. The corrections were made, however, and the differences between C; and 0, were 
found to be in no case greater than 0-010. The corresponding perfect series were computed 
for the spans at lag zero, and these yielded a coefficient C, = 0-1936. The discrepancy between 
this value and the corrected value 0, = 0-1955 is attributed to the departure of the dis- 
tribution from normality. 

It would appear that cases in which the present method is most likely to be useful in 
dendrochronological work will be cases of low correlation similar to that of the example 
here presented, and in the majority of such tests few corrections on the imperfect series 
coefficients are likely to be necessary. 





[Figure: horizontal axis, lag from -10 to 10; vertical axis ticks from -0.20 to 0.20.]

Fig. 7. Lag correlation graph of two tree ring series.


Moving averages of 30-year periods were used in preparing the data of this example. Selec- 
tion of the proper period for the moving averages is a problem by no means solved as yet. 
The purpose of the moving average is to remove trends in mean value and in average devia- 
tion. Douglass and his followers hold that chief stress must be laid upon rings which are 
conspicuously narrow or wide in comparison with their immediate neighbours (Glock, 1937). 
This argues for moving averages of short period. Gladwin’s researches seem to contradict 
this, for he finds that a 30-year moving average results in higher similarity at the match 
position. The use of a moving average in autocorrelation serves to raise the coefficients at 
either side of the match position (Davis, 1941). This is true to a lesser extent in lag correla- 
tion between different series, and could result in an error of match determination. Distribu- 
tion of the correlation on either side of the match position is clearly noticeable in Fig. 7. 
Final choice of period for the moving average must rest on further empirical studies. 


Significance of the coefficient 
It is necessary to have a means of distinguishing a coefficient which indicates a significant 
relationship between the series from one which might arise either by mere sampling fluctua- 
tion or by mere phase correspondence between the periodic features of the two series. In 
most tree ring width series the random factors appear to be much stronger than the periodic 
factors. There has been shown to be a certain amount of interdependence between the widths 



















Gordon D. Gibson 307


of rings of successive years due to retention of moisture in the soil. But the adjusted ring 
width series generally are distributed rather normally. The number of degrees of freedom 
lost because of periodic factors might be estimated from harmonic analysis of representative 
series, and use then made of existing tables of the distribution of r in samples of given size 
drawn from a presumably uncorrelated universe. 

However, a more direct approach is suggested. From studies on selected series, tables of 
the empirical distribution of C, at non-match positions for spans of convenient length could 
be compiled. The labour required for such a compilation would be much reduced by photo- 
electric assessment of the correlation. No assumptions as to the true correlation in the 
universe would be necessary; in fact, this might vary between a small positive and a small 
negative value on account of periodicity. From such a table the minimum value of C, for 
a match position of given level of confidence could be prescribed. 

If the true correlation in the universe sampled in the example presented above is zero, 
and if the example is representative of the results to be expected, the occurrence of relatively 
high negative coefficients makes it appear that series somewhat more than 100 years in 
length will be required for high confidence in cross-identification where the correlation is low. 


The author is grateful to Dr John Wishart for reading a preliminary draft of this paper, 
and for his helpful suggestions. 


REFERENCES 


Davies, G. R. (1930). First-moment correlation. J. Amer. Statist. Ass. 25, 413.

Davis, H. T. (1941). The Analysis of Economic Time Series. Indiana: Bloomington.

Douglass, A. E. (1919, 1928, 1936). Climatic Cycles and Tree-Growth, 1, 2, 3. Washington.

Foster, G. A. R. (1946). Some instruments for the analysis of time-series and their application to
textile research. J. R. Statist. Soc. B, 8, 42.

Gibson, G. D. (1947). On Gladwin’s methods of correlation in tree-ring analysis. Amer. Anthrop. 49,
337.

Gladwin, H. S. (1940). Tree-Ring Analysis. Methods of Correlation. Arizona: Globe.

Glock, W. S. (1937). Principles and Methods of Tree-Ring Analysis. Washington.

Gray, T. S. (1931). A photoelectric integraph. J. Franklin Inst. 212, 77.

Gressens, O. & Mouzon, E. D. Jr. (1927). The validity of correlation in time sequences and a new
coefficient of similarity. J. Amer. Statist. Ass. 22, 483.

Hazen, H. L. & Brown, G. S. (1940). The cinema integraph. J. Franklin Inst. 230, 19, 183.

Martindale, J. G. (1941). A correlation periodograph for the measurement of periods in disturbed
wave-forms. J. Text. Inst., Trans. 32, 71.

O’Bryan, D. (1948). Remarks on tree-ring analysis techniques in the Southwest. Amer. Anthrop. 50,
708.

Orcutt, G. H. (1948). A new regression analyser. J. R. Statist. Soc. A, 111, 54.

Pearson, K. (1930). Tables for Statisticians and Biometricians, Part I. Cambridge.

Yule, G. U. & Kendall, M. G. (1937). An Introduction to the Theory of Statistics. London.
















[ 308 ] 


THE MAXIMUM F-RATIO AS A SHORT-CUT TEST FOR 
HETEROGENEITY OF VARIANCE 


By H. O. HARTLEY


1. INTRODUCTORY 


The most useful form of the well-known test for heterogeneity of variances at present in use
is Bartlett’s (1937) modification $M = -2\log_e \mu$ of Neyman & Pearson’s (1931) $L_1$ test and
is given by

$$M = N \log_e\Big\{N^{-1} \sum_t \nu_t s_t^2\Big\} - \sum_t \nu_t \log_e s_t^2, \tag{1}$$

where the $k$ variance estimates $s_t^2$ ($t = 1, 2, \ldots, k$) are respectively based on $\nu_t$ degrees of
freedom and $N = \sum_t \nu_t$. Although tables of the percentage points of M, based on an adequate
approximation to its distribution (Hartley, 1940), are available (Thompson & Merrington,
1946), the computation of M is comparatively laborious and often a deterrent in a rapid
survey of data.

It is, therefore, the purpose of this paper to investigate the use of the ratio of the largest
of the $s_t^2$ to the smallest in the set, say $F_{\max} = s^2_{\max}/s^2_{\min}$, as a short-cut substitute for M.
Clearly, this maximum F ratio can be roughly assessed at a glance and can be gauged against
tables of percentage points provided here (Table 3) for a test of significance ‘without
computation’. The loss in power compared with the M test is calculated under certain assumptions
and is found to be negligible or small.


2. THE DISTRIBUTION OF $F_{\max}$ ON THE NULL HYPOTHESIS


We confine ourselves in the first place to $k$ variance estimates $s_t^2$ ($t = 1, 2, \ldots, k$) all based on
the same degrees of freedom $\nu$. On the null hypothesis these are independent estimates of
a common variance $\sigma^2$. Bartlett & Kendall (1946) have investigated the normalizing
log-transformation of $s^2$, viz. $u = \log_e s^2$, and have shown that $u$ is approximately normally
distributed with $\operatorname{var} u \simeq 2/(\nu - 1)$. This suggests that we may compute approximate
percentage points of $F_{\max}$ from

$$F_{\max}(\alpha) = \exp\big\{w_k(\alpha)\sqrt{2/(\nu - 1)}\big\}, \tag{2}$$

where $w_k(\alpha)$ is the 100$\alpha$ % point of the ‘range’ $w_k$ in independent normal samples of
size $k$, values of which have been tabulated (Pearson & Hartley, 1942). In order to check
and improve the accuracy of this approximation,* two comparisons with the exact null
distribution are made:

(a) Case $k = 2$. We have

$$P\{s^2_{\max}/s^2_{\min} > F\} = P\{s_1^2/s_2^2 > F\} + P\{s_1^2/s_2^2 < 1/F\}, \tag{3}$$

so that the upper 100$\alpha$ % points of $F_{\max}$ are given by the upper 50$\alpha$ % points of the F
distribution for degrees of freedom $(\nu, \nu)$. In Table 1 we give the upper 5 % points ($\alpha = 0.05$),


* The use of the exact variance of $u = \log s^2$, viz. $\Big[\dfrac{d^2}{dx^2}\log_e \Gamma(x)\Big]_{x = \frac12\nu}$ (see Bartlett & Kendall, 1946),
would improve the 5 % points for small $\nu$ and $k$, but would grossly underestimate the larger percentage
points.
















both from the exact F distribution and from the approximation (2). The latter is seen to be
adequate for $\nu \geq 4$ but needs adjustment for $\nu = 2, 3$. It is clear from this table that a check
against exact results for small $\nu$ and all $k$ is needed. We therefore make a second comparison.


Table 1. Comparison of exact and approximate 5 % points of $F_{\max}$ for $k = 2$

ν       | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 12   | 15   | 20
Approx. | 50.3 | 16.0 | 9.61 | 7.13 | 5.76 | 4.95 | 4.39 | 4.02 | 3.71 | 3.27 | 2.86 | 2.46
Exact   | 39.0 | 15.4 | 9.60 | 7.15 | 5.82 | 4.99 | 4.43 | 4.03 | 3.72 | 3.28 | 2.86 | 2.46
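
The 'Approx.' row of Table 1 can be reproduced roughly from equation (2). The following is a minimal sketch for the case k = 2 only, where the range of two unit normal deviates is sqrt(2) times the absolute value of a single unit normal deviate, so that no table of the range is needed. The function name `approx_fmax_k2` is an illustrative assumption, not part of the paper, and small differences from the printed figures in the last place are to be expected.

```python
import math
from statistics import NormalDist


def approx_fmax_k2(nu: int, alpha: float = 0.05) -> float:
    """Approximate upper 100*alpha % point of F_max for k = 2, from equation (2)."""
    # For k = 2 the range is |X1 - X2| = sqrt(2) * |Z|, so its upper alpha
    # point is sqrt(2) times the upper alpha/2 point of the unit normal.
    w2 = math.sqrt(2.0) * NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Equation (2): exp{ w_k(alpha) * sqrt(2 / (nu - 1)) }.
    return math.exp(w2 * math.sqrt(2.0 / (nu - 1)))


if __name__ == "__main__":
    for nu in (2, 3, 4, 20):
        print(nu, round(approx_fmax_k2(nu), 2))
```

For larger k the same formula applies, but the percentage point of the range has to be taken from tables such as those of Pearson & Hartley (1942) or computed numerically.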















































(b) The exact distribution of $F_{\max}$ for $\nu = 2$. Given $k$ random mean squares all based on
2 degrees of freedom, there are $k$ choices for $s^2_{\min}$, so that for a given $x$ the chance for
$x < s^2_{\min} < x + dx$ and all $s^2 < Fx$ is given by $k e^{-x}(e^{-x} - e^{-Fx})^{k-1}\,dx$. Hence we have for
the probability integral of $F_{\max}$

$$P = P\{F_{\max} < F\} = k \int_0^\infty e^{-x}\,(e^{-x} - e^{-Fx})^{k-1}\,dx. \tag{4}$$

Introducing $y = e^{-kx}$, we find

$$P = \int_0^1 \big(1 - y^{(F-1)/k}\big)^{k-1}\,dy = \sum_{i=0}^{k-1} \binom{k-1}{i} (-1)^i \Big\{\frac{F-1}{k}\,i + 1\Big\}^{-1}. \tag{5}$$

For large F (F > k) percentage points $F(\alpha)$ can be computed from the rapidly converging

iteration ae oe 
Fula) = 1-25 (FF ') (- G+ B@)— vr, (6) 


and with a good starting value for $F_0(\alpha)$ the first iterate from (6), $F_1(\alpha)$, is already the precise
percentage point $F(\alpha)$. In Table 2 we have set out these exact percentage points for $\alpha = 0.01$
and 0.05, and, in the case of 0.05, they are compared with the approximate percentage
points computed from (2).


Table 2. Exact upper 5 and 1 % points of the distribution of $F_{\max}$ in a set of k mean squares,
all based on 2 degrees of freedom; approximate upper 5 % points computed from equation (2).

k = number of mean squares.

k           | 2    | 3   | 4   | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
5 % Approx. | 50.3 | 108 | 170 | 235  | 299  | 364  | 431  | 497  | 560  | 623  | 688
5 % Exact   | 39.0 | 88  | 142 | 202  | 266  | 333  | 403  | 475  | 550  | 626  | 704
1 % Exact   | 199  | 448 | 729 | 1036 | 1362 | 1705 | 2063 | 2432 | 2813 | 3204 | 3605
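
The exact percentage points for 2 degrees of freedom can be checked numerically from the probability integral (5). The sketch below is an assumption-laden stand-in: it solves (5) for F by plain bisection rather than by the iteration (6), and the function names `fmax_cdf_nu2` and `fmax_point_nu2` are hypothetical.

```python
from math import comb


def fmax_cdf_nu2(F: float, k: int) -> float:
    """P{F_max < F} for k mean squares on 2 degrees of freedom, equation (5)."""
    return sum(
        comb(k - 1, i) * (-1) ** i / ((F - 1.0) * i / k + 1.0)
        for i in range(k)
    )


def fmax_point_nu2(k: int, alpha: float, lo: float = 1.0, hi: float = 1.0e6) -> float:
    """Upper 100*alpha % point of F_max, found by bisection on equation (5).

    Bisection is used here for simplicity; it is not the iteration (6).
    """
    target = 1.0 - alpha
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The probability integral increases with F, so move the lower
        # bound up while the integral is still below the target.
        if fmax_cdf_nu2(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, round(fmax_point_nu2(k, 0.05), 1), round(fmax_point_nu2(k, 0.01), 1))
```

For k = 2 equation (5) reduces to P = (F - 1)/(F + 1), which gives the tabulated 39.0 and 199 directly.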












































The two sets of exact results were finally used to adjust the approximation (2) by an 
expression of the type F(a) = F(a) (1 +99). (7) 
appr. 


with q, and q, fitted to the exact values at k = 2and v = 2. The adjusted values are tabulated 
in Table 3. For k = 2 and v =: 2 the values are exact, and for vy > 4 the adjustments from (7) 













were so slight that the answers shown in Table 3 agree virtually with the approximations 
computed from (2). 


Table 3. Upper 5 % points of $F_{\max} = s^2_{\max}/s^2_{\min}$ in a set of k mean squares,
all based on ν degrees of freedom

ν \ k | 2     | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
2     | 39.0  | 87.5 | 142  | 202  | 266  | 333  | 403  | 475  | 550  | 626  | 704
3     | 15.44 | 26.6 | 36.8 | 46.9 | 55.1 | 63.8 | 72.1 | 80.5 | 87.4 | 93.7 | 101
4     | 9.60  | 14.8 | 19.2 | 23.1 | 26.7 | 29.9 | 33.2 | 36.2 | 38.5 | 40.9 | 43.4
5     | 7.15  | 10.4 | 12.9 | 15.3 | 17.3 | 19.1 | 20.7 | 22.2 | 23.6 | 25.0 | 26.0
6     | 5.82  | 8.14 | 10.0 | 11.6 | 12.9 | 14.1 | 15.0 | 16.1 | 17.0 | 17.8 | 18.5
7     | 4.99  | 6.81 | 8.13 | 9.34 | 10.3 | 11.1 | 11.9 | 12.6 | 13.2 | 13.7 | 14.4
8     | 4.43  | 5.90 | 6.99 | 7.88 | 8.61 | 9.32 | 9.90 | 10.4 | 10.8 | 11.4 | 11.8
9     | 4.03  | 5.22 | 6.13 | 6.90 | 7.55 | 8.10 | 8.59 | 8.95 | 9.31 | 9.68 | 10.1
10    | 3.72  | 4.77 | 5.54 | 6.18 | 6.70 | 7.18 | 7.55 | 7.86 | 8.17 | 8.50 | 8.85
12    | 3.28  | 4.10 | 4.71 | 5.21 | 5.58 | 5.93 | 6.23 | 6.49 | 6.69 | 6.96 | 7.17
15    | 2.86  | 3.49 | 3.94 | 4.31 | 4.57 | 4.86 | 5.05 | 5.21 | 5.37 | 5.58 | 5.76
20    | 2.46  | 2.92 | 3.25 | 3.49 | 3.71 | 3.86 | 4.02 | 4.14 | 4.26 | 4.35 | 4.48
30    | 2.08  | 2.39 | 2.59 | 2.75 | 2.89 | 3.00 | 3.10 | 3.16 | 3.22 | 3.29 | 3.35
60    | 1.67  | 1.84 | 1.95 | 2.03 | 2.10 | 2.16 | 2.20 | 2.25 | 2.29 | 2.32 | 2.34
∞     | 1.00  | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00












































3. EXAMPLE OF THE USE OF TABLE 3 


Below are given six mean squares all based on 8 degrees of freedom: 10.1, 36.2, 47.8, 5.1,
27.6, 31.2. The largest mean square is 47.8 and the smallest 5.1; their ratio clearly exceeds
the upper 5 % point for $\nu = 8$ and $k = 6$ which is 8.61. Heterogeneity is therefore indicated.
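
The arithmetic of this example can be written out in a few lines. This is only an illustrative sketch: the variable names are assumptions, and the critical value 8.61 is copied by hand from Table 3 rather than computed.

```python
# Six mean squares, each based on 8 degrees of freedom (values from the example).
mean_squares = [10.1, 36.2, 47.8, 5.1, 27.6, 31.2]

# Upper 5 % point of F_max for nu = 8 and k = 6, read from Table 3.
critical_5pct = 8.61

# The short-cut statistic: largest mean square divided by the smallest.
f_max = max(mean_squares) / min(mean_squares)
heterogeneous = f_max > critical_5pct

print(round(f_max, 2), heterogeneous)
```

The ratio is about 9.37, which exceeds 8.61, in agreement with the conclusion stated in the text.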


4. THE APPROXIMATE COMPARISON OF THE POWER OF THE M- AND $F_{\max}$-TESTS
FOR A LOGNORMAL DISTRIBUTION OF GROUP VARIANCES


It is clear that the power of the test depends on the type of heterogeneity to be detected. As
an attempt to define the alternative hypothesis of heterogeneity one may specify different
values of $\sigma_t^2$, $t = 1, 2, \ldots, k$ for the $k$ groups. In the case $k = 3$, Catherine M. Thompson (1937)
has computed a few cases of power functions for Neyman & Pearson’s original criterion $L_1$.
These results were obtained by experimental sampling and checked against a Type III
approximation to the distribution of $1/L_1$ suggested by Wilks (1937) and based on his formulae
for the first two moments of $1/L_1$, which depend on the specified set of the $\sigma_t^2$ in the alternative
hypothesis. Since in the computed examples the $\nu_t$ differ, the criterion $L_1$ is not identical with
M and the results are not immediately applicable. Certain further results on the behaviour
of $L_1$ when the hypothesis tested is not true can be found in papers dealing with its bias
(Brown, 1939; Pitman, 1939; Bishop & Nair, 1939). Both Pitman and Brown derive
formulae for the power of $L_1$, and Brown gives short tables for the case $k = 2$ and five pairs
of $\nu_1$, $\nu_2$.

However, the main difficulty with the alternative hypothesis in which the $\sigma_t^2$ are specified
is that a complete investigation of the power would necessitate the evaluation of a $(k - 1)$-
parameter family* of power curves.

* It can be shown that the power depends on the ratios $\sigma_t^2/\sigma_1^2$ only.




A simpler model of an alternative hypothesis may however be considered by analogy with
what is known as the ‘random set up’ in Analysis of Variance power curve comparisons
(Johnson, 1948). We assume that there is a population of normal distributions from which
a sample of $k$ distributions ($t = 1, 2, \ldots, k$) has been drawn at random, and that, in turn,
$n = \nu + 1$ observations $X_{ti}$ ($i = 1, 2, \ldots, n$) have been sampled from each of these normal
distributions, from which samples the $s_t^2$ are computed. Heterogeneity is therefore
characterized by this population of normal distributions and, since the $s_t^2$ do not depend on
their means, by the population $g(S^2)\,dS^2$ (say) of the variances $S^2$ of these distributions. Under
this assumption the joint distribution of $s_t^2$ and $S_t^2$ is given by

$$f(s_t^2/S_t^2)\,d(s_t^2/S_t^2)\;g(S_t^2)\,dS_t^2,$$

where the function $f$ is the distribution of $\chi^2/\nu$ for $\nu$ degrees of freedom. Writing $v_t = s_t^2/S_t^2$,
the joint distribution of $v_t$ and $S_t^2$ is given by $f(v_t)\,dv_t\,g(S_t^2)\,dS_t^2$. The assumption of random
heterogeneity can therefore be stated as follows:

$H_1$: The $k$ sample mean squares $s_t^2$ are each the product of two factors; thus $s_t^2 = v_t S_t^2$,
where $v_t$ is distributed as $\chi^2/\nu$ and $S_t^2$ follows an independent distribution $g(S_t^2)\,dS_t^2$.

We shall now further assume that the distribution $g(S^2)$ is lognormal with variance $Z^2$
(that is, $\log_e S^2$ is normally distributed with variance $Z^2$). Then the log-transforms

$$\xi_t = \log_e s_t^2 = \log v_t + \log S_t^2 \tag{8}$$

can be represented as the sums of two independent variates, one of which ($\log S_t^2$) is exactly
normally distributed and the other ($\log v_t$) approximately, so that $\xi_t = \log_e s_t^2$ will be
approximately normally distributed with variance*

$$\sigma_\xi^2 = \psi'(\tfrac12\nu) + Z^2.$$

The approximate power of the two criteria, M and $F_{\max}$, must now be evaluated. Since
$F_{\max} = \exp(\text{range of } \log s_t^2)$, its power can clearly be obtained from the tables of the
probability integral of the normal range (Pearson & Hartley, 1942). Now, with the help of
suitable expansions of the logs in (1), it can be shown that approximately


k m 
M~ > &-B 


Table 4. Approximate comparison of the power of the M- and $F_{\max}$-test

Column headings: power of M = 1 − β

k  | 0.99  | 0.95  | 0.90  | 0.80 | 0.50
2  | 0.990 | 0.950 | 0.900 | 0.80 | 0.50
3  | 0.990 | 0.950 | 0.899 | 0.80 | 0.49
4  | 0.990 | 0.948 | 0.897 | 0.79 | 0.49
5  | 0.989 | 0.945 | 0.892 | 0.79 | 0.48
6  | 0.988 | 0.942 | 0.886 | 0.78 | 0.48
8  | 0.986 | 0.935 | 0.876 | 0.76 | 0.46
10 | 0.984 | 0.927 | 0.864 | 0.75 | 0.44
12 | 0.981 | 0.920 | 0.851 | 0.73 | 0.43

| 





* $\psi'(\tfrac12\nu) = \Big[\dfrac{d^2}{dx^2}\log_e \Gamma(x)\Big]_{x = \frac12\nu}$ is the exact variance of $\log v_t$ and hence of $\log s_t^2$ if $Z^2 = 0$.













so that the power of M can be approximately obtained from tables of the $\chi^2$ distribution. We
can therefore compute comparable values of the power of M and $F_{\max}$ as follows. We fix
$\alpha$ at, say, 0.05, and start from a given power of M, (say) $1 - \beta$. From a table of percentage
points of $\chi^2$ we obtain for any given $k$

$$\chi^2_\alpha / \chi^2_{1-\beta} = \{\psi'(\tfrac12\nu) + Z^2\}/\psi'(\tfrac12\nu), \tag{9}$$

where the $\chi^2$ values are taken for $k - 1$ degrees of freedom. Accordingly, the power of the
$F_{\max}$-test corresponding to the same value of $Z^2$ is obtained by entering the table of the
probability integral of normal range with $W = W_{0.05}\sqrt{\chi^2_{1-\beta}/\chi^2_\alpha}$. In Table 4 we give a few
comparisons. In the body of the table is given the power of the $F_{\max}$-test for an alternative
which the M-test is capable of detecting with a power given as the column heading ($\alpha = 0.05$,
$k$ = number of mean squares).
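
The procedure just described can be sketched numerically without printed tables. The code below is an illustration under stated assumptions, not the computation used for Table 4: the percentage points of $\chi^2$ come from a power series for the incomplete gamma function, the probability integral of the normal range is evaluated by Simpson's rule, and all function names are hypothetical. Note that $\psi'(\tfrac12\nu)$ cancels in the ratio (9), so $\nu$ does not enter.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def range_cdf(w, k, steps=400, lim=8.0):
    """P{range of k unit normals <= w}, by Simpson's rule over the smallest value x."""
    h = 2.0 * lim / steps
    total = 0.0
    for j in range(steps + 1):
        x = -lim + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * norm_pdf(x) * (norm_cdf(x + w) - norm_cdf(x)) ** (k - 1)
    return k * total * h / 3.0


def chi2_cdf(x, df):
    """Lower-tail chi-square probability from the incomplete gamma power series."""
    a, z = 0.5 * df, 0.5 * x
    term = total = 1.0
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))


def solve_increasing(f, target, lo, hi, iters=60):
    """Bisection for f(x) = target when f is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_of_fmax(k, power_of_m, alpha=0.05):
    """Approximate power of the F_max-test when the M-test has the given power."""
    df = k - 1
    # Upper alpha and upper (1 - beta) points of chi-square on k - 1 d.f.
    chi_alpha = solve_increasing(lambda x: chi2_cdf(x, df), 1.0 - alpha, 1e-9, 100.0)
    chi_beta = solve_increasing(lambda x: chi2_cdf(x, df), 1.0 - power_of_m, 1e-9, 100.0)
    # Upper alpha point of the normal range for samples of k.
    w_alpha = solve_increasing(lambda w: range_cdf(w, k), 1.0 - alpha, 0.0, 12.0)
    return 1.0 - range_cdf(w_alpha * math.sqrt(chi_beta / chi_alpha), k)


if __name__ == "__main__":
    for k in (2, 6, 12):
        print(k, [round(power_of_fmax(k, p), 3) for p in (0.99, 0.95, 0.90, 0.80, 0.50)])
```

For k = 2 the two tests coincide, so the computed power should return the column heading itself; for larger k the values can be compared with the body of Table 4.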


5. THE CASE IN WHICH THE $s_t^2$ ARE BASED ON DIFFERENT DEGREES OF FREEDOM


No general claim can be made that this general case is covered by the present short-cut
procedure, but if the $\nu_t$ do not differ considerably Table 3 will still be found useful as a rough
indicator of heterogeneity if entered with $\nu$ as the mean of the $\nu_t$. The approximations on
which this procedure is based cannot be given here. In all cases of doubt, Bartlett’s test
criterion must be evaluated.


REFERENCES 


Bartlett, M. S. (1937). Proc. Roy. Soc. A, 160, 268.

Bartlett, M. S. & Kendall, D. G. (1946). J. R. Statist. Soc. Suppl. 8, 128.

Bishop, D. J. & Nair, U. S. (1939). J. R. Statist. Soc. Suppl. 6, 89.

Brown, G. W. (1939). Ann. Math. Statist. 10, 119.

Hartley, H. O. (1940). Biometrika, 31, 249.

Johnson, N. L. (1948). Biometrika, 35, 80.

Neyman, J. & Pearson, E. S. (1931). Bull. Int. Acad. Cracovie, A, 460.

Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 302.

Pitman, E. J. G. (1939). Biometrika, 31, 200.

Thompson, C. M. & Merrington, M. (1946). Biometrika, 33, 295.

Thompson, C. M. (1937). Biometrika, 29, 127.

Wilks, S. S. (1937). Biometrika, 29, 124.










[ 313 ] 


TABLES OF THE $\chi^2$-INTEGRAL AND OF THE CUMULATIVE
POISSON DISTRIBUTION


By H. O. HARTLEY and E. S. PEARSON 


1. OTHER TABLES 


As is well known, the $\chi^2$-integral, the incomplete $\Gamma$-function and the cumulative Poisson
distribution* are all different forms of the same mathematical function. In the first
connexion the integral denotes the chance that the sum of squares of $\nu$ independent normal
deviates exceeds a given level $\chi^2$; it is given by

$$P(\chi^2, \nu) = 2^{-\frac12\nu}\,\{\Gamma(\tfrac12\nu)\}^{-1} \int_{\chi^2}^{\infty} x^{\frac12\nu - 1} e^{-\frac12 x}\,dx. \tag{1}$$

Whilst there are numerous tables of the inverse to (1) in the form of ‘percentage points of
$\chi^2$’, tables giving the probability integral $P(\chi^2, \nu)$ directly as a function of $\chi^2$ and $\nu$ are not
as numerous. We may mention here Elderton (1900), Pearson (1922) and Molina (1945).
In Elderton’s table the $\chi^2$-interval of 1 is too wide for small values of $\chi^2$ and $\nu$, while Molina’s
table is really a table of the Poisson distribution and does not provide the probabilities of
equation (1) for odd degrees of freedom. Karl Pearson’s (1922) tables of the incomplete
$\Gamma$-function, at present the most extensive tables of the integral, have the argument
$u = \chi^2/\sqrt{2\nu}$, that is, the ratio of distance from start of distribution to standard deviation.
Whilst this has the advantage of standardizing the tabular ranges and intervals for all $\nu$, it
has long been realized that the transformation to argument $u$ necessitates extra labour and
often additional interpolation, particularly when one is concerned with evaluating the integral
for a fixed value of $\chi^2$ and for a sequence of $\nu$. This last requirement occurs, for instance,
when expanding frequency functions in a $\chi^2$ or Laguerre series† and in certain applications
to the power of the $\chi^2$-criterion.


2. THE PRESENT TABLE 


The accompanying table (pp. 318-25) is less extensive than either K. Pearson’s table of the
incomplete $\Gamma$-function or Molina’s table of the Poisson series, and must be regarded as a
working table rather than a fundamental table of this function. It gives $P(\chi^2, \nu)$ to 5 places
of decimals for
$\nu$ = 1 (1) 30 (2) 70 degrees of freedom, and
$\chi^2$ = 0.000 (0.001) 0.01 (0.01) 0.1 (0.1) 2.0 (0.2) 10.0 (0.5) 20 (1) 40 (2) 134.

The difficulty of tabulating the integral at argument $\chi^2$ is that, for small values of $\chi^2$ and $\nu$,
a small $\chi^2$-interval is required, whilst with increasing $\chi^2$ and $\nu$ both the $\chi^2$-range as well as
the required $\chi^2$-interval extend, as indicated above. The objection that this frequent change
in tabular interval would make interpolation in the table difficult has, we think, been
adequately dealt with by the methods described in section 3. The tabular arrangement on
separate top and bottom sections of 8 pages is a compromise between convenience and
paginal economy.


* For the relation between the $\chi^2$-integral and the Poisson distribution, see equation (2) below.
[† Some work on these lines by S. H. Khamis which it is hoped to publish in an early issue of this
journal was, indeed, one of the reasons for pushing forward the computation of the present table. Ed.]
















A further advantage of using the argument $\chi^2$ is that the table provides at the same time
values of the cumulative Poisson distribution, since

$$P(\chi^2, \nu) = \sum_{i=0}^{c-1} e^{-m} m^i / i! \tag{2}$$

with $m = \tfrac12\chi^2$ and $c = \tfrac12\nu$. In each column, corresponding to a tabular value of $m$ (which
is printed underneath the corresponding $\chi^2$-argument ($= 2m$) at the head of each column),
the cumulative frequencies of the Poisson distribution are shown in heavy type, whilst the
entries $P(\chi^2, \nu)$ for odd $\nu$ (which are not required for the Poisson distribution) are distinguished
in ordinary type. The argument $c$ is shown in the right-hand margin of each page. The table does
not go beyond $c = 35$ ($\nu = 70$), and where, as in certain columns on pp. 318-20, the table stops
at a lower value of $c$, all entries beyond this value are 1.00000 to 5-decimal accuracy. Thus it
will be found that the complete Poisson distribution is provided for all $m < 15$, since for such
$m$ the cumulative frequencies beyond $c = 35$ are all 1. For $m > 15$, only the truncated Poisson
sum up to $c = 35$ is shown. Clearly the table must be bounded, and it was decided to make
it complete up to $\nu = 70$ for the more important $\chi^2$-argument.
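
The relation (2) is easy to check for even degrees of freedom. The sketch below, with the hypothetical function name `chi2_upper_even`, sums the Poisson terms for $m = \tfrac12\chi^2$ and $c = \tfrac12\nu$ and can be compared with individual entries of the table; it is an illustration only and is not the scheme by which the table was computed.

```python
import math


def chi2_upper_even(chi2: float, nu: int) -> float:
    """P(chi2, nu) for even nu, as the cumulative Poisson sum of equation (2)."""
    if nu % 2 != 0 or nu <= 0:
        raise ValueError("this sketch only covers even, positive degrees of freedom")
    m = 0.5 * chi2   # Poisson mean
    c = nu // 2      # number of Poisson terms, i = 0, 1, ..., c - 1
    return sum(math.exp(-m) * m ** i / math.factorial(i) for i in range(c))


if __name__ == "__main__":
    # Entries to compare with the table: chi2 = 12.0, nu = 12 and chi2 = 15.0, nu = 4.
    print(round(chi2_upper_even(12.0, 12), 5))
    print(round(chi2_upper_even(15.0, 4), 5))
```

For $\nu = 2$ the sum has the single term $e^{-m}$, which reproduces the second line of the table directly.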


3. METHODS OF INTERPOLATION 


(3.1) Single-entry interpolation $\chi^2$-wise
With no differences tabulated, Lagrangian interpolation or Aitken’s iterative method may
be used. We describe here a somewhat simpler method based on a Taylor expansion which
utilizes the relations

$$\frac{\partial P}{\partial \chi^2} = \tfrac12\{P(\chi^2, \nu - 2) - P(\chi^2, \nu)\}, \qquad \frac{\partial^2 P}{\partial(\chi^2)^2} = \tfrac14\{P(\chi^2, \nu - 4) - 2P(\chi^2, \nu - 2) + P(\chi^2, \nu)\}. \tag{3}$$

Let $\chi_0^2$ denote the tabular argument nearest to the given value of $\chi^2$ for which the answer is
required and let $\theta = \chi^2 - \chi_0^2$. Then from the second order Taylor expansion we obtain

$$\begin{aligned} 2P(\chi^2, \nu) = {} & P(\chi_0^2, \nu - 4) \times (\tfrac12\theta)^2 \\ & + P(\chi_0^2, \nu - 2) \times (\theta - 2(\tfrac12\theta)^2) \\ & + P(\chi_0^2, \nu) \times (2 - \theta + (\tfrac12\theta)^2). \end{aligned} \tag{4}$$

Example. To find P(3.64132, 12) the following method is suggested:
Copy down $\theta = 0.04132$, form and copy $(\tfrac12\theta)^2 = 0.00043$ and finally compute the sum of
products

$$\begin{aligned} P(3.64132, 12) = \tfrac12\{ & 0.89129\,(0.00043) \\ & + 0.96359\,(0.04132 - 0.00086) \\ & + 0.98962\,(2 - 0.04132 + 0.00043)\} = 0.98907. \end{aligned} \tag{5}$$

On a calculating machine the P-entries would be set as multiplicands and the terms inside
the brackets ( ) applied in turn as multipliers; the sum of products in { } so formed is
then immediately divided by 2.

The 5th decimal computed from (4) may be a few units in error. Where 3-decimal accuracy
is adequate, linear interpolation is usually sufficient and the $\theta^2$ terms of (4) may be omitted.
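
The worked example (5) can be repeated in code. In this sketch the three tabular entries are regenerated from the Poisson relation (2) instead of being copied from the table, which is possible here because all the degrees of freedom involved are even; the function names are illustrative assumptions.

```python
import math


def chi2_upper_even(chi2: float, nu: int) -> float:
    """P(chi2, nu) for even nu, from the Poisson relation (2)."""
    m, c = 0.5 * chi2, nu // 2
    return sum(math.exp(-m) * m ** i / math.factorial(i) for i in range(c))


def interpolate_chi2_wise(p_nu_minus_4: float, p_nu_minus_2: float, p_nu: float,
                          theta: float) -> float:
    """Second-order Taylor interpolate of equation (4), already divided by 2."""
    half_theta_sq = (0.5 * theta) ** 2
    return 0.5 * (
        p_nu_minus_4 * half_theta_sq
        + p_nu_minus_2 * (theta - 2.0 * half_theta_sq)
        + p_nu * (2.0 - theta + half_theta_sq)
    )


# Example of the text: chi2 = 3.64132, nu = 12, nearest tabular argument 3.6.
chi2_0, nu, theta = 3.6, 12, 0.04132
approx = interpolate_chi2_wise(
    chi2_upper_even(chi2_0, nu - 4),
    chi2_upper_even(chi2_0, nu - 2),
    chi2_upper_even(chi2_0, nu),
    theta,
)
exact = chi2_upper_even(chi2_0 + theta, nu)
print(round(approx, 5), round(exact, 5))
```

Because the relations (3) express the derivatives through entries in the same column, only three tabular values at $\chi_0^2$ are needed and no differences have to be formed.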




















(3.2) Single-entry interpolation $m$-wise
The method is almost identical with that of § 3.1. Let $m_0$ denote the tabular argument
nearest to $m$ and $\phi = m - m_0$; then

$$\begin{aligned} P(2m, 2c) = {} & P(2m_0, 2c - 4) \times \tfrac12\phi^2 \\ & + P(2m_0, 2c - 2) \times (\phi - \phi^2) \\ & + P(2m_0, 2c) \times (1 - \phi + \tfrac12\phi^2). \end{aligned} \tag{6}$$

For details of computation, compare with § 3.1.


(3.3) Double-entry interpolation
When it is intended to use the present table for the evaluation of the $\chi^2$-integral for
fractional degrees of freedom (or odd degrees of freedom, $\nu > 30$) double-entry interpolation
for both intermediate $\nu$ and $\chi^2$ is necessary. Similarly, when using the present tables for
computing the general Type III probability integral

$$y_0 \int_x^\infty (1 + \tau/a)^{\gamma a}\, e^{-\gamma\tau}\, d\tau, \tag{7}$$

using the relations $\nu = 2\gamma a + 2$ and $\chi^2 = 2\gamma(x + a)$, fractional $\nu$ and $\chi^2$ will usually arise.
In this case the relations (3) are particularly useful, as they allow the differential with
regard to $\chi^2$ to be expressed in terms of the same entries, $P(\chi_0^2, \nu)$, which are required for the
Lagrangian interpolation $\nu$-wise.

Let $\chi_0^2$ denote the tabular argument nearest to the given $\chi^2$, and $\nu_0$ the tabular line heading
nearest to, but smaller than the given $\nu$; then the following formulae and methods are
suggested, for which we must distinguish two cases:

(a) $\nu > 30$.
Let $w = \tfrac12(\nu - \nu_0)$, $\phi = \tfrac12(\chi^2 - \chi_0^2)$; compute and write down $w\phi$ and $w^2$, which are to be
used in calculating the interpolate as the sum of products:

$$\begin{aligned} P(\chi^2, \nu) \simeq {} & P(\chi_0^2, \nu_0 - 2) \times (\tfrac12 w^2 - \tfrac12 w + \phi - w\phi) \\ & + P(\chi_0^2, \nu_0) \times (1 - w^2 - \phi + 2w\phi) \\ & + P(\chi_0^2, \nu_0 + 2) \times (\tfrac12 w^2 + \tfrac12 w - w\phi). \end{aligned} \tag{8}$$

On a calculating machine, the multiplicands P are set and the terms within the brackets ( )
applied in turn as multipliers.
Example. Find P(25.2154, 39).
We have $w = 0.5$, $\phi = 0.1077$, whence $w^2 = 0.25$, $w\phi = 0.0538$. Hence we obtain

$$\begin{aligned} P(25.2154, 39) = {} & 0.91584\,(0.125 - 0.25 + 0.1077 - 0.0538) \\ & + 0.94815\,(1 - 0.25 - 0.1077 + 0.1077) \\ & + 0.96941\,(0.125 + 0.25 - 0.0538) = 0.95728. \end{aligned} \tag{9}$$
The exact interpolate is 0-95706. The main terms neglected in this formula are B” 6”, 2B"¢d" 


and }¢%8”, all 6’s being taken with regard to v. Of these terms, the first is usually the largest 
and may result in an error of a few units in the 4th decimal. If higher accuracy is needed, 













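a table of Lagrangian interpolation coefficients $L_{-1}$, $L_0$, $L_1$ and $L_2$ may be used in conjunction
with the following formula:

$$\begin{aligned} P(\chi^2, \nu) = {} & P(\chi_0^2, \nu_0 - 2) \times (L_{-1}(w) + \phi - w\phi) \\ & + P(\chi_0^2, \nu_0) \times (L_0(w) - \phi + 2w\phi) \\ & + P(\chi_0^2, \nu_0 + 2) \times (L_1(w) - w\phi) \\ & + P(\chi_0^2, \nu_0 + 4) \times L_2(w). \end{aligned} \tag{10}$$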


With this formula the first of the above error terms is eliminated and the remaining error
may sometimes amount to a few units in the 5th decimal, but is often smaller than $5 \times 10^{-6}$.

In the present example formula (10) yields 0.95709.
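
The three-point formula (8) for case (a) translates directly into code. The sketch below takes the three tabular entries as given numbers (copied from the worked example, not computed) and reproduces the interpolate of (9); the function name `interp_nu_gt_30` is an assumption made for illustration.

```python
def interp_nu_gt_30(p_below: float, p_at: float, p_above: float,
                    w: float, phi: float) -> float:
    """Equation (8).

    p_below, p_at, p_above are P(chi0^2, nu0 - 2), P(chi0^2, nu0) and
    P(chi0^2, nu0 + 2); w = (nu - nu0)/2 and phi = (chi^2 - chi0^2)/2.
    """
    return (
        p_below * (0.5 * w * w - 0.5 * w + phi - w * phi)
        + p_at * (1.0 - w * w - phi + 2.0 * w * phi)
        + p_above * (0.5 * w * w + 0.5 * w - w * phi)
    )


# Worked example: P(25.2154, 39) with chi0^2 = 25.0 and nu0 = 38.
chi2, nu = 25.2154, 39
chi2_0, nu_0 = 25.0, 38
w = 0.5 * (nu - nu_0)
phi = 0.5 * (chi2 - chi2_0)
value = interp_nu_gt_30(0.91584, 0.94815, 0.96941, w, phi)
print(round(value, 5))
```

The result differs from the exact interpolate 0.95706 quoted in the text by about two units in the 4th decimal, which is the order of error the text attributes to this formula.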


(b) $\nu < 30$.
Let $w = \nu - \nu_0$ and again $\phi = \tfrac12(\chi^2 - \chi_0^2)$. Compute and record $w\phi$ and $w^2$; then find the
interpolate as the sum of products

$$\begin{aligned} P(\chi^2, \nu) = {} & P(\chi_0^2, \nu_0 - 2) \times (\phi - w\phi) \\ & + P(\chi_0^2, \nu_0 - 1) \times (\tfrac12 w^2 - \tfrac12 w + w\phi) \\ & + P(\chi_0^2, \nu_0) \times (1 - w^2 - \phi + w\phi) \\ & + P(\chi_0^2, \nu_0 + 1) \times (\tfrac12 w^2 + \tfrac12 w - w\phi). \end{aligned} \tag{11}$$

For the operation on a calculating machine, see (a) above.
Example. Find P(5.8764, 10.3148).
We have $w = 0.3148$, $\phi = 0.0382$, whence $w^2 = 0.09910$, $w\phi = 0.01203$, so that

$$\begin{aligned} P(5.8764, 10.3148) = {} & 0.66962\,(0.0382 - 0.0120) \\ & + 0.75976\,(0.04955 - 0.1574 + 0.01203) \\ & + 0.83178\,(1 - 0.0991 - 0.0382 + 0.01203) \\ & + 0.88637\,(0.04955 + 0.1574 - 0.01203) = 0.84508. \end{aligned} \tag{12}$$
The main tems neglected in the above formula are B’3”, 2B"¢6” and 44746”, all 6’s being 
taken with regard to v. The error may be a few units in the 4th decimal, except near the 
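singularity $\chi^2 = 0$, $\nu = 0$, where interpolation $\nu$-wise by this method is not possible.* If
higher accuracy is required, a table of the 4-point Lagrangian coefficients may be used in
conjunction with the following formula:

$$\begin{aligned} P(\chi^2, \nu) = {} & P(\chi_0^2, \nu_0 - 2) \times (L_{-1}(1 + w) + \phi - w\phi) \\ & + P(\chi_0^2, \nu_0 - 1) \times (L_0(1 + w) + w\phi) \\ & + P(\chi_0^2, \nu_0) \times (L_1(1 + w) - \phi + w\phi) \\ & + P(\chi_0^2, \nu_0 + 1) \times (L_2(1 + w) - w\phi). \end{aligned} \tag{13}$$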
* For $\chi^2 < 1$, instead of interpolating $\nu$-wise in the table we may compute any required value of
$P(\chi^2, \nu)$ directly from the series

$$P(\chi^2, \nu) = 1 - e^{-m} \sum_{j=0}^{\infty} \frac{m^{c+j}}{\Gamma(c + j + 1)} \qquad (m = \tfrac12\chi^2,\; c = \tfrac12\nu),$$

which converges rapidly. A few values of the $\Gamma$-function for fractional arguments are required for its
evaluation.
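
The four-term formula (11) for case (b) can be sketched in the same way as the three-term formula for case (a). The tabular entries below are the four values quoted in the worked example (12); the function name `interp_nu_lt_30` is hypothetical.

```python
def interp_nu_lt_30(p_m2: float, p_m1: float, p_0: float, p_p1: float,
                    w: float, phi: float) -> float:
    """Equation (11).

    p_m2, p_m1, p_0, p_p1 are P(chi0^2, nu0 - 2), P(chi0^2, nu0 - 1),
    P(chi0^2, nu0) and P(chi0^2, nu0 + 1); w = nu - nu0 and
    phi = (chi^2 - chi0^2)/2.
    """
    return (
        p_m2 * (phi - w * phi)
        + p_m1 * (0.5 * w * w - 0.5 * w + w * phi)
        + p_0 * (1.0 - w * w - phi + w * phi)
        + p_p1 * (0.5 * w * w + 0.5 * w - w * phi)
    )


# Worked example: P(5.8764, 10.3148) with chi0^2 = 5.8 and nu0 = 10.
w = 10.3148 - 10
phi = 0.5 * (5.8764 - 5.8)
value = interp_nu_lt_30(0.66962, 0.75976, 0.83178, 0.88637, w, phi)
print(round(value, 5))
```

The four multipliers sum to 1, as they must for any interpolation formula that reproduces a constant, which gives a quick check on the hand computation.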










(3.4) Single-entry interpolation $\nu$-wise
The necessity for this type of interpolation does not arise very often, as interpolation
$\nu$-wise usually occurs in conjunction with interpolation $\chi^2$-wise, thereby leading to
double-entry interpolation. The above formulae (10) and (13) may be used with $\phi = 0$. For higher
accuracy, a 6-point Lagrangian formula may be employed.


4. COMPUTATION OF TABLES 


For the computation of the present tables we would like to acknowledge the expert help 
rendered by the Mathematics Division of the National Physical Laboratories. In particular, 
Mr T. Vickers has in no small way contributed to the method employed. Use was also 
made of a computational scheme outlined in an unpublished Thesis by S. H. Khamis and
based on the recurrence relations (3). 


REFERENCES 


Elderton, W. P. (1900). Biometrika, 1, 155.
Pearson, K. (1922). Tables of the Incomplete Γ-function. Cambridge University Press.
Molina, E. C. (1945). Tables of the Poisson Exponential Limit. D. Van Nostrand and Co.













































































x*=0-001 0-002 | 0-003 | 0-004 0-005 0-006 0-007 | 0-008 0-009 | 0-010 
vy |m=0-0005 | 0-0010 | 0-0015 | 00020 | 0-:0025 | 0:0030 | 0:0035 | 0-0040 0-0045 | 00050 | c v 
| 

1 | 0-97477 | 0-96433 | 0-95632 | 0-94957 | 0-94363 | 0-93826 | 0-93332 | 0-92873 | 0-92442 | 0-92034 1 
2 99950 | -99900| -99850/ -99800| -99750/ -99700| -99651| -99601| -99551  -99501| 1 2 
3 -99999 -99998 | -99996 | -99993 | -99991 | -99988 | -99984 | 99981 | ‘99977 | -99973 3 
4 | 99999: ‘99999 99999) 2 + 
| Vv 

6 

x? = 10-5 11-0 11-5 12-0 12-5 13-0 13-5 14-0 14-5 15-0 

vy | m= §-25 5-5 5-75 6-0 6-25 6-5 6:75 7-0 7:25 75 c v 
1 | 0-00119 | 0-00091 | 0-00070 | 0-00053 | 0-00041 | 0-00031 | 0-00024 | 0-00018 | 0-00014 | 0-00011 l 
2 “00525 00409 | -00318 | -00248 | -00193| -00150 | -00117; -00091 | -00071 | -00055/| 1 2 
3 01476 -01173 | -00931 | -00738| -00585| -00464| -00367/ -00291 | -00230] -00182 3 
4 03280 02656 02148 01735 01400 01128 00730 | -00586 | -00470} 2 4 
5 06225 05138 | -04232 | -03479| -02854 | -62338| -01912/ -01561 | -01273| -01036 5 
6 ‘10511 -08838 07410 06197 05170 03575 02452 | -02026| 3 6 
71 0-16196 | 0-13862 | 0-11825 | 0-10056 | 0-08527 | 0-07211 | 0-06082 | 0-05118 | 0-04297 | 0-03600 7 
8 ‘23167 ‘20170 | -17495 | -15120)| -18025| -11185| -09577 | -08177 | -06963 | -05915| 4 8 
9 31154 -27571 | -24299| -21331 | -18657| -16261 | -14126]| -12233| -10562} -09094 9 
10 39777 ‘85752 | -31991 | -28506 | -25299| -22367 | -19704| -17299| -15138| -13206| 5 10 
11 -48605 -44326 | -40237| -36364| -32726| -29333| -26190| -23299| -20655| -18250 ll 
12 57218 -52892 | -48662| -44568 | -40640/| -36904 | -33377) -30071 | -26992 24144] 6 12 
13 | 065263 | 0-61082 | 0-56901 | 0-52764 | 0-48713 | 0-44781 | 0-40997 | 0-37384 | 0:33960 | 0-30735 13 
14 “72479 ‘68604 | -64639 | -60630 | -56622 -52652 | -48759| -44971 41316 37815 | 7 14 
15 “78717 -75259 | -71641 | -67903| -64086| -60230| -56374| -52553| -48800| -45142 15 
16 83925 -80949 77762 | -74398 | -70890| -67276| -63591 | -59871 56152 52464] 8 16 
17 88135 -85656 | -82942| -80014| -76896| -73619| -70212| -66710| -63145| -59548 17 
18 ‘91436 -89436 87195 | -84724 79157 76106 | -72909 69596 66197 | 9 18 
19 | 0-93952 | 0-92384 | 0-90587 | 0-88562 | 0-86316 | 0-83857 | 0-81202 | 0-78369 | 0-75380 | 0-72260 19 
20 -95817 *94622 | -93221 91608 | -89779 | -87788 | -85492 | -83050 | -80427| -77641/ 10 20 
21 -97166 -96279 | -95214| -93962| -92513| -90862| -89010| -86960| -84718| -82295 21 
22 ‘98118 ‘97475 | - 95738 | -94618 | -93316 91827 90148 | -88279 | -86224] 11 22 
23 98773 ‘98319 | -97748 | -97047| -96201 | -95199| -94030| -92687| -91165| -89463 23 
24 -99216 ‘98901 ‘97991 | -97367 | -96612 95715 ‘93454 | -92076 | 12 24 
25 | 0-99507 | 0-99295 | 0-99015 | 0-98657 | 0-98206 | 0-97650 | 0-96976 | 0-96173 | 0-95230 | 0-94138 25 
26 ‘99696 ‘99555 | - ‘99117 | -98798 | -98397) -97902| - ‘96581 | -95733 }-13 26 
27 99815 -99724 | -99598 | -99429| -99208| -98925| -98567| -98125| -97588| -96943 27 
28 -99890 ‘99831 | -99749 | -99637 | -99487 | -99290| -99037 | -98719 97844 | 14 28 
29 -99935 -99899 | -99846| -99773| -99672| -99538| -99363] -99138| -98854| -98502 29 
30 0:99963 0-99940 | 0-99907 | 0-99860 | 0-99794 | 0-99704 | 0-99585 | 0-99428 | 0-99227 | 0-98974 | 15 30 
32 -99988 ‘99980 | -99968 | -99949 | -99922 | -99884  -99831  -99759 | -99664) -99539/ 16 32 
34 ‘99996 99994 | -99989 | -99983 , -99972 | -99957 | -99935 | -99904 | -99862 -99804/| 17 34 
36 ‘99999 ‘99998 | -99997 | -99994 ‘99991 | -99985  -99976 -99964  -99946 -99921/ 18 36 
38 -99999 99998 | -99997 | -99995 | -99992 ‘99987 | -99980| -99970| 19 38 
40 | | 0-99999 0-99998 | 0-99997 | 0:99996 | 0-99993 | 0-99989 | 20 40 
42 |  -99999 -99999 | -99998 | -99996 | 21 42 
44 | | | | | -99999 | -99999 | 22 44 
| | | | 46 

48 
50 

52 


















































































χ² = 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 | 0.10
ν | m = 0.005 | 0.010 | 0.015 | 0.020 | 0.025 | 0.030 | 0.035 | 0.040 | 0.045 | 0.050 | c
| it 

1} 0-92034 | 0-88754 | 0-86249 | 0-84148 det canes 0-79134 othe end 0-75183 

2 99501 -99005 | -98511 -98020| -97531| -97045  -96561| -96079 | -95600| -95123 

3 99973 -99925 | -99863 | -99790| -99707| -99616| -99518| -99412| -99301| -99184 

4 99995 | 99989 | -99980 | -99969 -99956  -99940| -99922 | -99902| -99879 

Fs -99999 | -99998 | -99997! -99995| -99993 | -99991 | -99987| -99984 

6 | 99999 09000 | -99999 | -99998 

| 
χ² = 15.5 | 16.0 | 16.5 | 17.0 | 17.5 | 18.0 | 18.5 | 19.0 | 19.5 | 20.0
ν | m = 7.75 | 8.0 | 8.25 | 8.5 | 8.75 | 9.0 | 9.25 | 9.5 | 9.75 | 10.0 | c
1 | 0:00008 | 0-00006 | 0-00005 | 0-00004 | 0-00003 | 0-09002 | 0-00002 | 0-00001 | 0-00001 | 0-00001 

2 -00043 00034 | -00026 | -00020| -00016 | -00012/| -00010| -00008| - 00005 | 1 
3 -00144 00113 | -00090 | -00071 | -00056| -00044 | -00035 | -00027| -00022| -00017 

4 00377 00302 | -00242| -00193| -00154/| -00123 00079 | -00083 | -00050] 2 
5 -00843 00684 | -00555| -00450| -00364| -00295| -00238| -00192| -00155! -00125 

6 01670 01375 | -01181/| -00928| -00761 | -00623/ -00510| -00416| -00340 3 
7 | 0-03010 | 0-02512 | 0-02092 | 0-01740 | 0-01444 | 0-01197 | 0-00991 | 0-00819 | 0-00676 | 0-00557 

8 -05012 04288 | -08576 | -03011| -02530 | -02123/| -01777| -01486| -01240/ -01034| 4 
9 -07809 -06688 | -05715 | -04872| -04144| -03517| -02980| -02519| -02126| -01791 

10 ‘11487 09963 | -08619| -07436| -06401| 05496 -04709 03435 | -02925| 5 
11 -16073 -14113 | -12356| -10788| -09393| -08158| -07068| -06109| -05269| -04534 

12 -21522 ‘19124 | -16939| -14960| -18174| -11569| -10133 07716 | -06709 

13 | 027719 | 0-24913 | 0-22318 | 0-19930 | 0-17744 | 0-15752 | 0-13944 | 0-12310 | 0-10840 | 0-09521 

14 ‘34485 ‘81387 | -28380| -25618 | -23051| -20678| -18495| -16495| -14671/| -13014| 7 
15 -41604 38205 | -34962 | -31886| -28986| -26267| -23729| -21373| -19196| -17193 

16 -48837 45296 | -41864 | -38560 | -35398 | -32390| -29544 | -26866 | -24359| -22022| 8 
17 -55951 52383 | -48871 | -45437| -42102| -38884| -35797| -32853| -30060| -27423 

18 62740 59255 | -55770| -52811| -48902| -45565 | -42820| -39182| -36166/| -33282| 9 
19 | 0-69033 | 0-65728 | 0-62370 | 0-58987 | 0-55603 | 0-52244 | 0-48931 | 0-45684 | 0-42521 | 0-39458 

20 74712 71662 | -68516| -65297 | -62031| -58741| -55451| -52188| -48957| -45793 | 10 
2i 79705 -76965 | -74093 | -71111 | -68039| -64900| -61718| -58514| -55310| -52126 

22 83990 ‘81589 | -79082| -76386 | -73519| -70599| -67597 | -645383 | -61428 | -58304| 11 
23 87582 ‘85527 | -83304| -80925| -78402| -75749| -72983| -70122| -67185| -64191 

24 90527 ‘88808 | -86919 | -84866 | -82657| -80301| -77810| -75199  -72483 | -69678 | 12 
25 | 0-92891 | 0-91483 | 0-89912 | 0-88179 | 0-86287 | 0-84239 | 0-82044 | 0-79712 | 0-77254 | 0-74683 

26 -94749 93620 | -92341 |, -90908 (39320 | ‘87577 | -85683 | -83648 | -81464| -79156| 138 
27 96182 95295 | -94274| -93112| -91806| -90352| -88750}| -87000| -85107| -83076 

28 97266 -96582 | -95782| -94859| -93805| -92615| -91235| -89814/| -88200_ - 14 
29 98071 97554 | -96939 | -96218| -95383 | -94427| -93344] -92129| -90779| -89293 

30 | 0-98659 | 0-98274 | 0-97810 | 0-97258 | 0-96608 | 0-95853 | 0-94986 | 0-94001 | 0-92891 | 0-91654 | 15 
32 -99379 ‘99177 | -98925 | -98617 -98248| -97796 | -97269 | -96653 | -95941| -95126 | 16 
34 -99728 -99628 | -99500 | -99339 | -99137| -98889| -98588 | -98227| -97799| -97296 | 17 
36 -99887 -99841 | -99779| -99700| -99597| -9 99306 | -99107 | -98884| -98572 | 18 
38 -99955 99935 | -99907| -99870| -99821 | -99757| -99675 | -99572/| -99442 | -99281| 19 

| 

40 | 0-99983 | 0-99975 | 0-99963 | 0-99947 | 0-99924 | 0-99894 | 0-99855 | 0-99804 | 0-99738 | 0-99655 | 20 
42 -99994 99991 99986 | 99979 | -99969 99956 , 99938 | 99914 99882  -99841 | 21 
44 -99998 -99997 | -99995 | -99992 | -99988 | -99983  -99975 | -99964| -99949 | -99930 | 22 
46 -99999 -99999 | -99998 | -99997 | -99996 | -99993 | -99990 | -99986 | -99979| -99970 | 23 
48 -99999 | -99999 | -99998| -99998| -99996 | -99994| -99992| -99988 | 24 
50 099999 SE, ER 0-99998 | 0-99997 | 0-99995 | 25 
52 | | : -99999 | -99998 | 26 





















































χ² = 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0
ν | m = 0.05 | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50 | c
1 0-75183 | 0-65472 | 0-58388 | 0-52709 | 0-47950 | 0-43858 | 0-40278 | 0-37109 | 0-34278 | 0-31731 
2 95123 : 86071 | -81873 | -77880 | -74082| -70469| -67032| -63763 | -60653/; 1 
3 -99184 ‘97759 | -96003 | -94024 | -91839 | -89643 | -87320| -84947| -82543 | -80125 
4 99879 99532 | -98981 | -9 ‘97350 | -96306 | -95133 90980 | 2 
5 99984 ‘99911 | -99764 | -99533 | -99212| -98800| -98297 97703 | -97022 | -96257 
6 99998 99985 | -99950 99784 | -9964 7 | -98912 | -98561| 3 
7 0-99997 | 0-99990 | 0-99974 | 0-99945 | 0-99899 | 0-99834 | 0-99744 | 0-99628 | 0-99483 
8 99998 | -99994| -99987 | -99973 | -99953 | -99922 | -99880| -99825) 4 
9 -99999 | -99997 | -99993 | -99987 | -99978 | -99964 | -99944 
10 99999 | -99998 99994 | -99989 | -99983| 5 
ll 99999 | -99998 | -99997 | -99995 
12 99999 | -99999] 6 
| 
| 
χ² = 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
ν | m = 10.5 | 11.0 | 11.5 | 12.0 | 12.5 | 13.0 | 13.5 | 14.0 | 14.5 | 15.0 | c
1 0-00001 . 
2 00008 | 0-00002 | 0-00001 | 0-00001 1 
3 “00011 ‘00007 | -00004 | -00003 | 0-00002 | 0-00001 | 0-00001 
4 00032 00020 | -00013 | -00008 | -00005 | -00003 | -00002 | 0:00001 | 0-00001 | 0-00001 | 2 
5 00081 00052 | -00034 | -00022 | -00014; -00009 | -00006 | -00004| -00002 | -00002 
6 00184 00121 | -00080 | -00052 | -00084/| 00022 -0015| -00009 | -00006 | -00004/| 3 
7 0-:00377 | 0-00254 | 0-00171 | 0-00114 | 0-00076 | 0-00050 | 0-90033 | 0-00022 | 0-00015 | 0-00010 
8 00715 : 00336 | -00229| -00155 | -00105 | -00071 | -00047 | -00032 | -00021; 4 
9 01265 00888 | -00620 | -00430| -00297 00204 | -00140| -00095| -00065 | -00044 
10 02109 01511 | -01075 | -00760 | -00535 |- 00374 -00260 | -00181/ -00125;| -00086| 5 
11 03337 02437 | -01768 | -01273 00912 00649 | -00460 | -00324 | -00227| -00159 
12 03752 | -02773 | -02034| -01482;| -01073 | -00773 | -00553 | -00394/| -00279| 6 
13 0-07293 | 0-05536 | 0-04168 | 0-03113 | 0-02308 | 0-01700 | 0-01244 | 0-00905 | 0-00655 | 0-00471 
14 10163 0786 06027 | -04582 | -03457 | -02589 | -01925 | -01423| -01045| -00763) 7 
15 13683 “10780 | -08414| -06509 | -04994| -03802 | -02874| -02157/ -01609| -01192 
16 17851 14319 | -11374| -08950 | -06982| -05403 | -04148 | -03162| -02394| -01800| 8 
17 22629 *18472 | -14925| -11944| -09471! -07446 | -05807 04494 | -03453 02635 
18 27941 28199 | -19059 | -15503 | -12492| -09976 | -07900| -06206 03745 | 9 
19 0-33680 | 0-28426 | 0-23734 | 0-19615 | 0-16054 | 0-13019 | 0-10465 | 0-08343 | 0-06599 | 0-05180 
20 39713 84051 | -28880 | -24239| -20143/| -16581 | -18526 | -10940| -08776| -06985 | 10 
21 45894 39951 34398 29306 24716 | -20645| -17085 | -14015| -11400;| -09199 
22 52074 45989 | -40173 | -347238 | -29707| -25168| -21123 | -17568)| -14486/ -11846| 11 
23 -58109 52025 | -46077 40381 35029 | -30087 | +25597| -21578| -18031 | -14940 
24 63873 57927 | -51980 | -46160 | -40576| -35317 | -30445 | -26004| -22013| -18475 | 12 
25 0-69261 063574 | 0-57756 | 0-51937 | 0-46237 | 0-40760 | 0-35588 | 0-30785 | 0-26392 | 0-22429 
26 “74196 68870 | -68295 | -57597 | -51898| -46311/| - 85846 | -31108 | -26761 | 13 
27 *78629 ‘73738 | -68501 63032 | -57446| -51860| -46379 | -41097 36090 | -31415 
28 78129 | 73804 -68154| -62784/ -57305 | 51825 -46445| -41253 | -36322 | 14 
29 85915 82019 | -77654 72893 | -67825| -62549| -57171 51791 46507 | -41400 
30 | 088789 | 0-85404 | 0-81526 | 0-77208 | 072508 | 0-67513 | 0-62827 | 0-57044 | 0-51760 | 0-46565 | 15 
32 ‘93167 90740 | -87830 | -84442/ -80603 | -76361 | -71779 | -66986 | -61916 | -56809 | 16 
34 “96039 . 92360 | -89871| -86931 | -83549| ‘79755 -75592| -71121| -66412| 17 
36 97814 96781 | -95425 | -92703 | -91584/| -89047 | -86088 | -82720| -78972| -74886 | 18 
38 98849 98231 | -97383 | -96258 | -94815| -93017 | -90838| -88264| -85296 | -81947 | 19 
40 | 0-99421 | 0-99071 | 0-98568 | 0-97872 | 0-96941 | 0:95733 | 0-94213 | 0-92350 | 0:90122 | 0-87522 | 20 
42 99721 99533 | -99250| -98840| -98269/| -97499 | -96491 | -95209| -93622| -91703 | 21 
44 99871 99775 | -99623 | -99394 | -99060/ -98592 -97955 | -97116| -96038 | -94689 | 22 
46 99943 : 99818 | -99695 | -99509 | -99238 | -98854 | -98329/ -97630 | -96726 | 23 
48 99976 99954 | -99916 | -99853 | -99754)| -99603 | -99382 | -99067 | -98634| -98054 | 24 
50 | 0-99990 | 0:99980 | 0-99962 | 0-:99931 | 0-99881 | 0-99801 | 0-99678 | 0-99498 | 0-99241 | 0-98884 | 25 
52 99996 99992 | -99984/ -99969 | -99944 | -99903 | -99839 | -99789| -99592 | -99382 | 26 
54 99999 99997 | -99993 | -99987 | -99975 | -99955 | -99922 | -99869| -99789 | -99669 | 27 
56 99999 | -99997 | -99994| -99989 | -99980 | -99963 | -99937 | -99894/ -99828 | 28 
58 99999 | -99998 | -99995 | -99991 | -99983 | -99970| 99949 -99914| 29 
60 0-99999 | 0-99998 | 0-99996 | 099993 | 0-99986 | 0-99976 | 0:99958 | 35 
62 99999 | -99998 | -99997 | -99994/ -99989 | -99980 | 31 
64 99999 | -99999 | -99997 | -99995 | -99991 | 32 
66 99999 | -99998 | -99996 | 33 
68 ‘99999 | -99998 | 34 
70 0-99999 | 35 













21 





_— 





0-99999 





0-27332 
54881 
-75300 
87810 
94488 

| -97689 


0-99093 





99664 
-99882 
99961 
| *99987 
| 99996 
| 099999 


| 
| 


“00644 
01000 
01505 
02199 


0-03125 





0-25421 
“52205 
-72913 
86138 
-93493 
‘97166 

0-98844 
99555 
-99838 

99944 


-99981 
“99994 


0-99998 
99999 





| 33 
16-5 


0-00001 





| 








1-4 | 1-5 





0-70 = 0-75 
0-23672 | 0-22067 
49659 | -47237 
-70553 -68227 
84420 | -82664 
-92431 -91307 
96586 | -95949 
0-98557 | 0-98231 
99425 | -99271 
-99782 -99715 
-99921 | -99894 
-99973 -99962 
99991 | -99987 
0-99997 | 0-99996 
99999 | -99999 
34 | 35 
170 | 175 
0-00001 
0-00002 | 0-00001 
00004 | -000038 
-00009 -00006 
00019 | -00012 
-00036 -00025 
00068 | -00047 
0-00120 | 0-00085 
00206 | -00147 
-00341 -00246 
00543 | -00397 
-00840 | -00622 
01260 | -00945 
0-01838 | 0-01397 
02613 | -02010 
-03624 | -02824 
04912 | -03875 
-06516 | -05202 
08467 | -08840 
0-10791 | 0-08820 
‘13502 | -11165 
-16605 *13887 
-16987 
23926 | -20454 
0-28083 | 0-24264 
‘87145 | -82754 
-48774 | -42040 
-56402 | -51600 
-65496 | -60893 
0:73682 | 0-69453 
80548 | -76943 
86147 | -83185 
90473 | -88150 
93670 | 91928 
095935 | 0-94682 
97476 | -96611 
98483 | -97908 
99117 | -98750 
99275 
0-99727 | 0-:99593 
‘99855 | -99778 
-99925 | -99882 
-99963 | -99939 
-99982 | -99970 
0-99991 | 0-99985 














36 
18-0 


0-00059 
00104 
-00177 
00289 
00459 
00706 


0-01056 





0-99991 
-99997 
-99999 


37 
18-5 




















| 
8s | 419 2-0 
090 {| 0-95 1:00 c 
0-17971 | 0-16808 | 0-15730 
-40657 | +-38674| -36788| 1 
-61493 -59342 -57241 
‘77248 | -75414/| -73576 
-87607 *86280 | -84915 
-93714 91970} 3 
0-97008 | 0-96517 | 0-95984 
-98654 | -98393 | -98101| 4 
-99425 | -99295 | -99147 
‘99766 | -99705| -99634| 5 
-99908 | -99882 -99850 
‘99966 | -99954 | -99941] 6 
0-99988 | 0-99983 | 0-99977 
99996 | -99994| -99992| 7 
-99999 | -99998 | -99997 
99999 | -99999| 8 
| | 
we | ie 40 | 
190 | 19:5 20-0 c 
| 3 
0-00001 | 4 
-00002 | 0-00001 | 0-00001 
00004 | - 00002 | 5 
-00008 | -00005 | -00004 
00015 | -00011 | -00007| 6 
0-00029 | 0-00020 | 0-00014 
° 00086 | -00026| 7 
‘00090 | -00064 | -00045 
00151 60109 | -00078| 8 
-00246 00179 | -00129 
00387 ‘00209 | 9 
0-00593 | 0-00442 | 0-00327 
‘ -00667 | -00500 | 10 
01289 -00981 “00744 
01832 | -01411 | -01081 | 11 
02547 ‘01984 | -01537 
02731 | -02189 | 12 
0-04626 | 0-03684 | 0-02916 
06056 | -04875 | -03901 13 
-07786 | -06336 | -05124 | 
06613 | 14 
12234 10166 -08394 
0:14975 | 0-12573 | 0-10486 | 15 
‘21479 | 18398 | -15651 16 
29203 | -25497| -22107 | 17 
37836 | -338689 | -29703 | 18 
-46948 | -42461/ -38142 | 19 
0-56061 | 0-51514 |.0-47026 | 20 
64717 | -60842 | -55909 | 21 
“72550 | -68538 | 64370 | 22 
“79814 | -75804 | -72061 | 23 
‘84902 | -81963 | -78749 | 24 
0-89325 | 086968 | 0:84323 | 25 
-92687 | -90872 | -88782 | 26 
-95144 | -98800 | -92211 | 27 
‘96873 | -95914| -94752 | 28 
‘98046 | -97887 | -96567 | 29 
0-98815 | 0-98377 | 0-97818 | 30 
-99302 | -99021 | -98653 | 31 
-99600 | -99425 | -99191 | 32 
99777 | -99672 | -99527 | 33 
99879 | -99818 | -99731 | 34 
0-99986 | 0-99902 | 0-99851 | 35 


















































χ² = 2.2 | 2.4 | 2.6 | 2.8 | 3.0 | 3.2 | 3.4 | 3.6 | 3.8 | 4.0
ν | m = 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 2.0 | c
1 0-13801 0-12134 | 0-10686 | 0-09426 | 0-08327 | 0-07364 | 0-06520 | 0-05778 | 0-05125 | 0-04550 
2 “%3287 30119 | -27253 | -24660| -223 20190 | -18268 | -16530| -14957| -13534; 1 
3 -63195 -49363 | -45749 -42350 39163 36181 +33397 -30802 +28389 -26146 
4 -69903 “66263 | -62682 | -59183 | -55783 49325 | -46284/| -43375| -40601 
5 *82084 *79147 *76137 -73079 -69999 66918 63857 -60831 ‘57856 | «54942 
6 -90042 ‘87949 | -85711/ -83350)| -80885 75722 | -73062| -70372| -67668| 3 
7 0-94795 093444 | 0-91938 | 0-90287 | 0-88500 | 0-86590 | 0-84570 | 0-82452 | 0-80250 | 0-77978 
8 97426 ‘96623 | -95691 | - ‘934386 | -92119| -90681; -89129| -87470| -85712) 4 
9 -98790 *98345 | -97807 ‘97170 -96430 95583 -94631 -93572 -92408 91141 
10 -99457 99225 | -98934/ -98575 | -98142| -97632| -97039 | -96359| -95592 | -94735 
ll -99766 -99652 | -99503 -99311 -99073 ‘98781 -98431 -98019 *97541 96992 
12 -99903 -99850 | -99777 -99396 | -99200| -98962 | -98678 6 
13 0-99961 0-99938 | 0-99903 | 0-99856 | 0-99793 | 0-99711 | 0-99606 | 0-99475 | 0-99314 | 0-99119 
14 -99985 ‘99975 | -99960 | -99938 | -99907| -99866 | -99813 | -99743 | -99655 | -99547| 7 
15 -99994 ‘99990 | -99984]| -99974| -99960]| -99940| -99913/| -99878| -99832| -99774 
16 -¥9998 -99996 | -99994| -99989 | -99983 | -99974| -99961 | -99944/| -99921| -99890/ 8 
17 -99999 -99999 | -99998 | -99996| -99993/ -99989| -99983| -99975 | -99964| -99948 
18 -99999 | -99998 | -99997 | -99995 -99989 | -99984 | -99976 
19 0-99999 | 0-99999 | 0-99998 | 0-99997 | 0-99995 | 0-99993 | 0-99989 
20 -99999 | -99999 | -99998| -99997 | -99995 | 10 
21 “99999 -99999 “99998 
22 -99999 | 11 
χ² = 42 | 44 | 46 | 48 | 50 | 52 | 54 | 56 | 58 | 60
ν | m = 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | c
10 | 0-00001 5 
ll -00002 0-00001 
12 -00003 -06002 | 0-00001 6 
13 0-00006 0-00003 | 0-00001 | 0-00001 
14 00012 00006 | -00003 | -00001 | 0-00001 
15 -00023 ‘00011 -00005 -00003 -00001 | 0-00001 
16 00040 00020 | -00010 | -00005 | -00002; -00001 | 0-00001 
17 -00067 -00034 -00017 -00009 -00004 | -00002 -00001 | 0-00001 
18 00111 00058 | -00030 | -00015 | -00008 | -00004, -00002; -00001 9 
19 0-00177 0-00094 | 0-00050 | 0-00026 | 0-00013 | 0-00007 | 0-00003 | 0-00002 | 0-00001 
20 ‘ 00151 | -00081 | -00043 | -00022 | -00011 | -00006 | -00003 | -00001 | 000001 | 10 
21 -00421 -00234 00128 -00069 -00036 | -00019 00010 | -00005 -00003 -00001 
22 00625 00355 | -00198 | -00109 | -00059 | -00031  -00016 | -00009 -00004| -00002/ 11 
23 -00908 -00526 ‘00299 -00167 -00092 -00050 -00027 00014 | -00007 -00004 
24 01291 00763 | -00443 -00252, -00142; -00078 -00043 -00023 -00012; -00006 | 12 
25 0-01797 | 0-01085 | 0-00642 | 0-00373 | 0-00213 | 0-00120 | 0-00066 | 0-00036 | 0-00020 | 0-00011 
26 ‘ ‘01512 | -00912| - 00314 | -00180 -00102| - 00031 | -00017 | 13 
27 -03292 -02068 -01272 00768 -00455 -00265 | -00152 -00086 -00048 -00026 
28 04336 . 01743 | -01072;| -00647 -00384 | -00224| -00129, -00073 14 
29 -05616 03670 02346 01470 | -00903 -00545 | -00324 00189 00109 00062 
30 | 0-07157 | 0-04769 | 0-03107 | 0-01983 | 0-01240 | 0-00762 | 0:00460 | 0-00273 | 0:00160 | 0-00092 | 15 
32 ‘11108 07689 | -05200| -03440| -02229| -01417| -00884| -00543| - “00195 | 16 
34 ‘16292 ‘11704 | -08208 | -05627| -03775 | -02482, -01601/ -01014 -00387 | 17 
36 22696 ‘16900 | -12277| -08713 | -06048| -04111 | -02739) -01791| -01151/ -00727/ 18 
38 ‘30168 ‘23250 | -17477| -12828 | -09204| -06463 | -04446 01987 | -01293 | 19 
40 | 038426 | 0:30603 | 0-23771 | 0-18026 | 0-13358 | 0-09682 | 006872 | 0-04781 | 003263 | 0-02187 | 20 
42 -47097 “88691 | -31010| -24264)| -18549| -18867 | -10147| -07274/| -05114/| -03529/| 21 
44 -55769 -47164 | -38938 | -31393 | -24730/| -19048| -14357| -10599| -07669| -05444/ 22 
46 64046 ‘55637 | -47227| -39170| -31753 | -25172| -19525 | -14830| -11088| -08057 | 23 
48 71603 63742 | -55515| -47285 82094 | -25591 | -19981 | -15285 -11465 | 24 
50 | 0-78216 | 0-71172 | 0-68458 | 0-55400 | 0-47340 | 0-39593 | 0-32416 | 0-25990 | 0-20417 | 0-15724 | 25 
52 83770 ‘77710 | -70766 | -63191 | -55292)| -47392| -39786 | -32721 | -26371| -20836/ 26 
54 -88257 -83242 | -77230! -70382/| -62939| -55190| -47440| -39970| -33011 | -26734/| 27 
56 -91746 ‘87750 | -82737| -76774| -70019| -62700| -55094/ -47486| -40143 | -33287/ 28 
58 94363 ‘91291 | -87260| -82253 | -76340| -69674| -62475| -55003 | -47530/ -40308 | 29 
60 | 0-96258 | 0-93978 | 0-90848 | 0-86788 | 0-81790 | 0'75926 | 0-69347 | 0-62261 | 0-54917 | 0-47572 | 30 
62 97585 ‘95949 | -98598 | -90415 | -86331| -81345| -75531 | -69035 | -62058/ -54835/ 31 
64 ‘98483 ‘97347 | -95639 | -93224| -89993 | -85889 | -80917| -75153/ -68738| -61864| 32 
66 -99073 ‘98308 | -97106| -95330 | -92854| -89582) -85462| -80507/ -74792)| -68454/| 33 
68 -99448 ‘98949 | -98128 | -96862| -95022)| -92491| -89181| -85049| -80112/| -74445/| 34 
70 | 099680 | 0-99364 | 0-98819 | 0-97943 | 0-96616 | 0-94716 | 0-92134 | 0-88790 | 0-84649 | 0-79731 | 35 

































































































χ² = 4.2 | 4.4 | 4.6 | 4.8 | 5.0 | 5.2 | 5.4 | 5.6 | 5.8 | 6.0
ν | m = 2.1 | 2.2 | 2.3 | 2.4 | 2.5 | 2.6 | 2.7 | 2.8 | 2.9 | 3.0 | c
1 0-04042 0-03594 | 0-03197 | 0-02846 | 0-02535 | 0-02259 | 0-02014 | 0-01796 | 0-01603 | 0-01431 
2 11080 | - 09072 | - 07427 | -06721| -06081 | -05502 | -04979; 1 
3 -24066 22139 | -20354 | -18704| -17180| -15772| -14474| -13278| -12176| -11161 
4 3 85457 | 33085 | -30844| -28730| -26739| -24866| -23108| -21459/| -19915| 2 
5 -52099 49337 | -46662 | -44077| -41588| -39196| -36904| -34711| -32617| -30622 
6 62271 | -59604| -56971| -54381| -51843| -49363 | -46945 | -44596| -42319' 3 
7 0-75647 0-73272 | 0-70864 | 0-68435 | 0-65996 | 0-63557 | 0-61127 | 0-58715 | 0-56329 | 0-53975 
8 ‘83864 81935 | - ‘77872 | -75758 | -73600| -71409| -69194| -66962| -64723| 4 
9 -89776 *88317 | -86769 | -85138 83431 | -81654 | -79814 77919 | -75976 | -73992 
10 93787 92750 | -91625 | -90413 | -89118| -87742| -86291/| -84768| -83178| -81526| 5 
ll -96370 *95672 | -94898 | -94046| -93117| -92109| -91026 89868 | -88637 87337 
12 97955 97 97002 | -96433 | -95798 | -95096 | -94327 92583 | -91608/ 6 
13 0-98887 0-98614 | 0-98298 | 0-97934 | 0-97519 | 0-97052 | 0-96530 | 0-95951 | 0-95313 | 0-94615 
14 99414 99254 | -99064/ -98841 | -98581 | -98283| -97943/ -97559| -97128| - 
15 -99701 -99610 | -99501 | -99369 | -99213) -99029 98816 | -98571 | -98291 | -97975 
16 99851 99802 | -99741 | -99666 | -99575 | -99467 99187 | -99012 | -98810/ 8 
17 -99928 -99902 | -99869| -99828| -99777 | -99715 99639 | -99550 | -99443 | -99319 
18 99966 99953 | -99936 | -99914| -99886 | -99851 ‘99757 | -99694 9 
19 0-99985 0-99978 | 0-99969 | 0-99958 | 0-99943 | 0-99924 | 0-99901 | 0-99872 | 0-99836 | 0-99793 
20 -99993 99990 | -99986 | -99980 | -99972| -99962| -99950| -99934)| -39914/ -99890 | 10 
21 -99997 “99995 | -99993 | -99991 | -99987| -99982| -99975 | -99967| -99956 | -99943 
22 99999 99998 | -99997 99994 | -99991 | -99988| -99984| -99978 | -99971| 11 
23 -99999 -99999 | -99999 99998 | -99997| -99996/ -99994| -99992| -99989 | -99986 
24 99999 99999 | -99998 | -99997' -99996 | -99995 | -99993 | 12 
25 0-99999 | 0-99999 | 0-99999 | 0-99998 | 0-99998 | 0-99997 
26 99999 | -99999 | -99998 | 13 
27 -99999 | -99999 

χ² = 62 | 64 | 66 | 68 | 70 | 72 | 74 | 76 | 78 | 80
ν | m = 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | c
21 0-00001 
22 00001 | 0-00001 ll 
23 -00002 ‘00001 | 0-00001 
24 00003 00002 | -00001 12 
25 0-00006 0-00003 | 0-00002 | 0-00001 
26 00009 00005 | - 00001 | 0:00001 18 
27 -00014 “00008 00004 | -00002 -00001 | 0-00001 
28 00023 00012 00004 | -00002 | -00001 | 0-00001 14 
29 00035 “00019 00011 | -00006 |} -00003 | -00002/ -00001 
30 0-00052 | 0-00029 | 0-00016 | 0-00009 | 0-00005 | 0-00003 | 0-00001 | 0:00001 15 
32 00114 00066 | -00038 | -00021 | -00012{ -00007 | -00004| - 0-00001 | 0:00001 | 16 
34 00234 00139 | -00082 | -00047 | -00027 | -00015 | -00009 | -00005 | -00003 , -00001 | 17 
36 00452 00277 | -00167 | -00100 -00059 | -00034 -00020 | -00011| -00006  -00004/ 18 
38 00828 00522 | -00324| -00198| -00120/ -00071 | -00042 -00025| -00014, -00008/ 19 
40 0:01441  0-00934 | 0-00596 | 0-00375 | 0:00233 | 0-00142 | 0-00086 | 0-00051 | 0:00030 | 0-00018 | 20 
42 02392 01594 | -01045 | -00675 | -00430 | -00270 | -00167| -00102  -00062  -00087 | 21 
44 03795 02600 | -01751 -01161| -00758 | -00488 | -00310  -00194 | 00120 -00073 | 22 
46 05772 04062 | -02810| -01912/ -01281 -00846 -00550 | -00353 | -00224, -00140| 23 
48 08437 ‘06097 | -04329| -03023 -02077 -01405| -00987 | -00616  -00399 | -00256 | 24 
50 011880 | 0-08810 | 0-06418 | 0-04596 | 0:03237 | 0-02245 | 0-01533 | 0-01032 | 0:00685 | 0-00448 | 25 
52 16148 ‘12283 | -09175 | -06736| -04862| -03453 | -02415 | -01664{| -01130 | -00757 | 26 
54 21237 ‘16557 | -12675  -09533 | -07049| 05127 -03670 | -02587| -01797 | -01231 | 27 
56 27080 ‘21623 | -16953 | -13057| -09884| -07358 | -05390 | -03888 | -02762 | -01934 | 28 
58 33550 27412 | -21994| -17885 | -13428 | -10227!| -07663 | -05652 | -04105 | -02938 | 29 
60 | 0-40465 | 0-33801 | 0:27730 | 0-22851 | 0-17705 | 0-13789 | 0-10563 | 0-07964 | 0-05912 | 0-04323 | 30 
62 47611 40615 | -34040| -28035| -22694 -18063/ -14140/| -10893 | -08260 -06169 | 31 
64 54757 -47649 | -40758 | -34270 | -28328 | -23026| -18409 | 14482, -11215 | -08552 | 32 
66 61680 ‘54683 | -47685 | -40894 | -34490 | -28609| -23846| -18745| -14816| -11530/ 33 
68 68183 ‘61504 | -54612| -47719| -41025| -34700| -28880| -23654, 19071 15140 | 34 
70 | 0-74112 0-61335 | 0-54544 | 0-47752 | 0-41150 | 0-34908 | 0-29141 | 0:23953 | 0-19388 | 35 





| 0-67928 










































































































χ² = 6.2 | 6.4 | 6.6 | 6.8 | 7.0 | 7.2 | 7.4 | 7.6 | 7.8 | 8.0
m = 3.1 | 3.2 | 3.3 | 3.4 | 3.5 | 3.6 | 3.7 | 3.8 | 3.9 | 4.0 | c
0-01278 | 0-01141 | 0-01020 | 0-00912 | 0-00815 | 0-00729 | 0-00652 | 0-00584 | 0-00522 | 0-00468 
-04505 04076 | -03688 | -03337 | -03020| -02732, -02472| - : 0 1 
-10228 09369 | -08580 | -07855| -07190| -06579| -06018| -05504| -05033| -04601 
‘18470 ‘17120 | -15860 | -14684/ -18589 | -12569 11620 | -107388 | -09919| -09158| 2 
-28724 -26922 | -25213| -23595 | -22064| -20619| -19255| -17970| -16761| -15624 
40116 37990 | -35943 33974 | -32085 | -30275 26890 | -25313 | -23810; 3 
0:51660 | 0-49390 | 0-47168 | 0-45000 | 0-42888 | 0-40836 | 0-38845 | 0-36918 | 0-35056 | 0-33259 
60252 | -58034| -55836 | :53663 | -51522| -49415 | -47349| -45325 | -43347) 4 
“71975 -69931 | -67869 | -65793| -63712| -61631| -59555| -57490| -55442| -53415 
“79819 78061 | -76259 | -74418| -72544| -70644| -68722, -66784 | -64837 | -62884/ 5 
85969 84539 | -83049| -81504| -79908| -78266| -76583| -74862| -73110| -71330 
9 | -88288 | -87054| -8576i | -84412) -83009| -81556/ -80056 | -78513| 6 
0-93857 | 0-93038 | 0-92157 | 0-91216 | 0-90215 | 0-89155 | 0-88038 | 0-86865 | 0-85638 | 0-84360 
-96120 95538 | -94903 | -94215| -93471| -92673| -91819| -$0911| -89948 | -88933/| 7 
-97619 ‘97222 | -96782| -96296| -95765| -95186| -94559| -93882 | -93155| -92378 
‘98579 ‘98317 | -98022| -97693 | -97326| -96921 | -96476| -95989  -95460| -94887| 8 
-99174 -99007 | -98816| -98599| -98355| -98081| -97775| -97437| -97064| -96655 
99532 -99429 | -99309 99171 | -99013 | -98833 ‘98147 | -97864| 9 
0-99741 | 0-99679 | 0-99606 0-99521 | 0-99421 | 0-99307 | 0-99176 | 0-99026 | 0-98857 | 0-98667 
-99860 4 99781 | -99729| -99669 | -99598 | -99515 | -99420| -99311 | -99187/ 10 
-99926 “99905 | -99880 | -99850! -99814| -99771 | -99721 | -99662! -99594| -99514 
-99962 ‘99950 99919 | -99898 | -99873 | -99843 | -99807 997 ‘99716 | 11 
-99981 -99974 | -99967 | -99957| -99945) -99931 | -99913| -99892;, -99867| -99837 
‘99990 ‘99978 | -99971 | -99963 | -99953 | -99941 12 
0-99995 | 0-99994 | 0-99991 | 0-99989 | 0-99985 | 0-99981 | 0-99975 | 0-99968 | 0-99960 | 0-99949 
-99998 ‘99997 | -99996 | -99994; -99992 | -99990 -99987 | -99983 -99973 | 13 
-99999 -99999 | -99998 | -99997| -99996| -99995| -99993| -99991 | -99989| -99985 
| ‘99999 | -99999  -99999 | -99998 | -99998 | -99997 | -99996 ‘99992 | 14 
-99999 | -99999| -99999| -99998| -99998| -99997]| -99996 
0-99999 | 0-99999 | 0-99999 | 0-99999 | 0-99998 | 15 
| | 
| | | | | 
χ² = 82 | 84 | 86 | 88 | 90 | 92 | 94 | 96 | 98 | 100
m = 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | c
0:00001 17 
00002 | 0:00001 18 
00004 -00002 | 0-:00001 | 0-00001 19 
0:00010 | 0:00006 | 000003 | 0-00002 | 0-00001 0-00001 20 
-00022 ‘00013 | -00007' -00004 | -00002 | -00001 | 0-00001 21 
00044 00026 | -00016 -00009 -00005 -00002{ -00002 | 0-00001 22 
00086 00053 | -00632 | -00019 | -00011 -00007 | -00004  -00002 | 0-00001 | 000001 | 23 
-00162 00101 | -00062 | -00038 | -00023 00014 -00008| -00005 | -00003 | -00002 | 24 
000290 | 0:00185 | 0-00117 | 0-00073 | 0-00045 | 0-00027 | 0-00017 | 0-00010 | 0-00006 | 000003 | 25 
-00500 ° ‘00210 | -00134| -00084| -00053 4 00012 | -00007 | 26 
00832 00555 | -00365 | -00238 | -00153 -00097 | 1 | -00038 | -00023 -00014/} 27 
01335 00910 | -00612 | -00406 | -00267| -00173 | -00111 -00070| 00044 -00027/ 28 
62073 01442 | -00990 | -00671  -00450| -00298 | -00195| 00126 -00081  -00051/ 29 
003115 | 0-02214 | 0-01552 | 001074 | 0-00734 | 0-00495 | 0-00330 | 0-00218 | 0-00142 | 0-00092 | 30 
“04540 , 02357 | -01664/ -01160| -00798 | -00543 | -00365| - ‘00159 | 31 
06425 04757 | -03473 -02502 | 01778 | -01248 | -00865 | -00592 | -00401 | -00269 | 32 
-08839 04974 | -03653 -02648 | -01894| -01338 | -00934  -00644 -00439/| 33 
‘11839 09122 | -06928 -05189 | 03834! -02795| -02012| -01431| -01005| -00698 | 34 
0-15457 0-12142 | 0-09401 | 0-07176 | 0-05404 | 0-04015 | 0-02944 | 0-02132 | 0-01525 | 0-01078 | 35 
| | | | 
















14 


15 























































χ² = 8.2 | 8.4 | 8.6 | 8.8 | 9.0 | 9.2 | 9.4 | 9.6 | 9.8 | 10.0
ν | m = 4.1 | 4.2 | 4.3 | 4.4 | 4.5 | 4.6 | 4.7 | 4.8 | 4.9 | 5.0 | c

1 0-00419 | 0-00375 | 0-00336 | 0-00301 | 0-00270 | 0-00242 | 0-00217 | 0-00195 | 0-00175 | 0-00157 

2 01657 01500 | -01357 | -0 ‘O1111 | -01005 | -00910/ -00823 -00745| -00674 

3 04205 -03843 | -03511 | -03207 | -02929| -02675| -02442 | -02229| -02034)| -01857 

4 08452 07798 | -07191 | -06630 | -06110 05184 | -04773 

5 *14555 *13553 | -12612 | -11731 | -10906 | -10135| -09413| -08740| -08110| -07524 

6 -22381 21024 | -19736 | -18514| -17858| -1 15230 | -14254 

7 0-31529 | 0-29865 | 0-28266 | 0-26734 | 0-25266 | 0-23861 | 0-22520 | 0-21240 | 0-20019 | 0-18857 

8 -41418 39540 | - 35945 | - 82571 | -30968 27985 | -26503] 4 

9 -51412 -49439 | -47499 | -45594/ -43727| -41902 | -40120| -38383 | -36692) -35049 
10 -60931 57044 -55118 | -53210| -51823 | -49461 | -47626| -45821| -44049] 5 
11 *69528 *67709 | -65876 | -64035 | -62189| -60344 | -58502| -56669 | -54846| -53039 
12 76931 “75814 ‘71991 | -70293 | -68576 | -66844| -65101 | -63350| -61596/ 6 
13 | 0-83033 | 0-81660 | 0-80244 | 0-78788 | 0-77294 | 0-75768 | 0-74211 | 0-72627 | 0-71020 | 0-69393 
14 87865 86746 | -85579/| - 83105 | ‘8 80461 | -79081| -77666| -76218| 7 
15 91551 *90675 | -89749 | -88774| -87752| -86683 | -85569| -84412/ -83213| -81974 
16 94269 % 92142 | -91341 89603 | -88667| -87686 | -86663 
17 96208 95723 | -95198 | -94633 | -94026| -93378| -92687 | -91954| -91179| -90361 
18 97551 95974 94974 | -94418 | -93824/ -93191 
19 | 0-98454 | 0-98217 | 0-97955 | 0-97666 | 0-97348 | 0-97001 | 0-96623 | 0-96213 | 0-95771 | 0-95295 
20 99046 98887 | -98709 | -98511 | -98291 | -98047| -97779| -97486| -97166| -96817 | 10 
21 -99424 *99320 | -99203 | -99070| -98921 | -98755 | -98570| -98365 | -98139| -97891 
22 -99659 99593 | -99518 | -99431 | -99333 | -99222/ -99098| -98958 | -98803 | -98630/| 11 
23 99802 ‘99761 | -99714| -99659 | -99596 | -99524| -99442/ -99349| -99245/ -99128 
24 99888 99863 | -99833 | -99799 | -99760| -99714/| -99661 | -99601 | -99532) -99455 
25 | 0-99937 | 0-99922 | 0-99905 | 0-99884 | 0-99860 | 0-99831 | 0-99798 | 0-99760 | 0-99716 | 0-99665 
26 99966 99957 | -99947/ - 9991) 99902 | -99882  -99858 | -99830 | -99798 
27 -99981 -99977 | -99971 99963 | -99955 | -99944 | -99932| -99917) -99900| -99880 
28 -99990 99987 | -99984 99975 | -99969 | -99962 | -99953 | -99942| -99930/| 14 
29 99995 -99993 | -99991 99989 | -99986 | -99983 | -99979| -99973 -99967)| -99960 
30 | 0-99997 | 0-99997 | 0-99996 | 0-99994 | 0-99993 | 0-99991 | 0-99988 | 0-99985 | 0-99982 | -99977 | 15 
32 99999 -99999 | -99999  -99998 | -99998 | -99997 | -99997 | -99996 | -99995 | -99993 | 16 
34 99999 | -99999 | -99999 | -99999 | -99998 | 17 

χ² = 102 | 104 | 106 | 108 | 110 | 112 | 114 | 116 | 118 | 120
ν | m = 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | c
48} 0-00001 24 
50 | 0-00002 | 0-00001 | 0-00001 25
52 00004 00002 | -00001 | 0-00001 26 
54 00008 00005 | -00003 | -00002 | 0:00001 | 0:00001 27 
56 00017 00010 ; -00006 | -00004, -00002 | -00001 | 0-00001 28 
58 00032 00020 | -00012 | -00007 | -00004 -00003 - 0-00001 | 0-00001 29 
60 | 0-00059 | 0-00037 | 0:00023 | 0-00014 | 0:00009 | 0-00005 | 0-00003 | 0:00002 | 0-00001 | 0.00001 | 30 
62 00104 00067 | -00043 | -00027 | -00017 | -00010 | -00006 | -00004 -00002, -00001/ 31 
64 00178 00117 | -00076 | -00049 | -00031 | -00019 | -00012 | -00008 | -00005 | -00003 | 32 
66 00296 00198 | -00130 | -00085 | -00055 | -00035 | 00022 -00014 | -00009 | -00005 | 33 
68 00479 00325 | -00219 | -00145 | -00095 | -00062  -00040 | -00026 -00016 | -00010 | 34 
70 | 0-00753 | 0-00521 | 0-00356 | 0-00241 | 0-00161 | 0-00107 | 0-00070 | 0:00046 | 0-00029 | 0-00019 | 35 

x?= 122 124 126 128 130 132 134 

v | m= 61 62 63 64 65 66 67 
62 | 0-00001 31 
64 00002 | 0-00001 | 0-00001 82 
66 00003 00002 | -00001 | 0:00001 33 
68 00006 00004 | -00002 | -00002 | 0-00001 34 
70 | 0-00012 | 0-00008 | 0:00005 | 0-00003 | 0:00002 | 0:00001 | 0:00001 | 35 
















ON A SEQUENTIAL t-TEST


By S. RUSHTON
Imperial College 


1. INTRODUCTION 


It happens, not infrequently, that we are given a variate $X$ and a fixed number $a$, and we
wish to know whether the probability, $\Pr\{X < a\}$, that $X$ will be less than $a$, is equal to one
value $p$, or to another value $p'$. For example, $X$ might be the tensile strength of a steel rod,
and $a$ a limit below which we would not wish the tensile strength to fall; a rod whose tensile
strength was less than $a$ would then be classified as 'defective', and $\Pr\{X < a\}$ would then
be the proportion of defective rods in a large batch. We might want to know whether this
proportion of defective rods was equal to a low value $p$, or to a relatively high value $p'$. Such
a question can be answered directly by using a sequential test based on a series of observations
each of which consists in observing whether or not the tensile strength of a given rod is less
than $a$. The theory of such a test is simple, and the practical procedure has been given by
Wald (1947) and Barnard (1945). A test like this, however, does not make use of the fact that
the tensile strength can be measured more or less exactly, and can often be assumed to be
normally distributed in a large batch. If we make the assumption that $X$ is normally
distributed with mean $\mu$ and standard deviation $\sigma$, then $\Pr\{X < a\} = \Phi((a-\mu)/\sigma)$, where
$\Phi(x)$ is the cumulative distribution function for the standard normal distribution. Further,
we can always shift the origin of measurements to $a$, by subtracting $a$ from all the readings,
and then the probability in which we are interested becomes $\Pr\{X < 0\} = \Phi(-\delta)$, where
$\delta = \mu/\sigma$. If we put $p = \Phi(-\delta)$ and $p' = \Phi(-\delta')$, then the question we are asking about
$p$ versus $p'$ becomes the question whether $\mu/\sigma = \delta$ or whether $\mu/\sigma = \delta'$. If $\sigma$ happens to be
known, there is a simple sequential test available for settling this question, given in Wald
(1947). We shall be concerned in this paper with the case where $\sigma$ is not known. A sequential
test is available for this case also, but its use involves a rather complicated function. The main
object of this paper is to give an approximation to this function which will enable the test
to be carried out in practice.
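
As a small illustration of the correspondence $p = \Phi(-\delta)$ used above, the following sketch converts between a proportion defective and the ratio $\delta = \mu/\sigma$. The function names are hypothetical and chosen only for this example; the normal distribution function is taken from the Python standard library.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def p_from_delta(delta):
    """Proportion below the limit, p = Phi(-delta), for delta = mu/sigma."""
    return _STD_NORMAL.cdf(-delta)


def delta_from_p(p):
    """Inverse relation: the delta for which Phi(-delta) equals p."""
    return -_STD_NORMAL.inv_cdf(p)


if __name__ == "__main__":
    for delta in (0.0, 0.5, 3.0):
        print(delta, round(p_from_delta(delta), 5))
```

A pair of proportions $(p, p')$ can in this way be turned into the pair $(\delta, \delta')$ between which the sequential test discriminates.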


2. THE LIKELIHOOD RATIO TEST CRITERION 


If we wish to test a hypothesis $H$ against a hypothesis $H'$, with risks of error $\alpha$ when $H$ is
true and $\beta$ when $H'$ is true, the general theory of sequential tests tells us to take observations
sequentially and at each stage calculate the likelihood ratio $l$. If

$$\beta/(1-\alpha) < l < (1-\beta)/\alpha,$$

then we continue to take observations. If the right-hand inequality is broken we accept $H'$,
while if the left-hand inequality is broken we accept $H$. In our case the hypothesis $H$ states
that $X$ is normally distributed with unspecified standard deviation $\sigma$ and mean $\mu = \delta\sigma$,
$\delta$ being specified, while $H'$ states that $X$ is normally distributed with unspecified standard










deviation $\sigma$ and mean $\mu = \delta'\sigma$, $\delta'$ being specified. If $n$ observations $x_i$ $(i = 1, 2, \ldots, n)$ have
been taken, then the likelihood function $A$ is

$$\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{n}\exp\left[-\frac{(n-1)s^2 + n(x_{\cdot}-\mu)^2}{2\sigma^2}\right] = A(x_{\cdot}, s \mid \mu, \sigma), \text{ say,}$$

where $n x_{\cdot} = \sum_i x_i$ and $(n-1)s^2 = \sum_i (x_i - x_{\cdot})^2$. The fact that $A$ involves $x_{\cdot}$ and $s$ only, and is
otherwise independent of the observations, shows that $x_{\cdot}$ and $s$ are jointly sufficient statistics
for $\mu$ and $\sigma$. Further, the hypotheses $H$ and $H'$ are 'invariant under change of scale', in the
sense that if $H$ is true about $X$, and we put $Y = aX$, where $a$ is any fixed positive number,
then $Y$ will be normally distributed with unspecified standard deviation $a\sigma$, and mean
$a\mu = \delta a\sigma$, $\delta$ having the value specified in $H$; in other words, if $H$ is true about $X$, then it is
also true about $Y = aX$. The same holds good for $H'$. And under the transformation
$X \to aX$, $\sigma$ becomes $a\sigma$ and $s$ becomes $as$, so that any given value of $\sigma$ can be changed into
any other given value by a suitable choice of $a$, and the same holds good for $s$. Finally, the
ratio $t = x_{\cdot}\sqrt{n}/s$ is unaltered by a transformation $X \to aX$. Stein (unpublished) and, after
him, Barnard (unpublished), have shown that under these circumstances we can obtain
a sequential test of $H$ against $H'$ by considering only the distribution of $t$. On $H$, $t$ has the
non-central $t$ distribution on $(n-1)$ degrees of freedom with parameter $\delta$, the probability
density function (Johnson & Welch, 1940) being





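$$\phi(t \mid \delta, n) = \frac{\Gamma(n)\,\exp[-\tfrac12 n(n-1)\delta^2/(n-1+t^2)]}{2^{\frac12(n-2)}\,\Gamma(\tfrac12(n-1))\,\sqrt{\pi}\,\sqrt{n-1}} \left(\frac{n-1}{n-1+t^2}\right)^{\frac12 n} \mathrm{Hh}_{n-1}(-\delta u), \tag{1}$$

where

$$u = t\sqrt{n/(n-1+t^2)} \tag{2}$$

and

$$\mathrm{Hh}_n(x) = \int_0^\infty \frac{z^n}{n!}\exp[-\tfrac12(z+x)^2]\,dz \tag{3}$$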


is the Hh-function discussed and tabulated in Airey (1931). On $H'$, $t$ has the probability
density function $\phi(t \mid \delta', n)$, so that the likelihood ratio test criterion is

$$l_n(t \mid \delta, \delta') = \phi(t \mid \delta', n)/\phi(t \mid \delta, n). \tag{4}$$

In practice it is easier to run the sequential test procedure in terms of the logarithm of the
likelihood ratio, i.e. to calculate $\log l_n(t \mid \delta, \delta')$ at each stage, and if

$$\log \beta/(1-\alpha) < \log l_n(t \mid \delta, \delta') < \log (1-\beta)/\alpha, \tag{5}$$

we take a further observation, while if the right-hand inequality is broken we accept $H'$, and
if the left-hand inequality is broken we accept $H$. Thus the problem of the practical carrying
out of the test reduces to the problem of evaluating $\log l = \log l_n(t \mid \delta, \delta')$ as a function of $t$,
for each value of $n$.
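
The decision rule (5) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, natural logarithms are used, and a value exactly equal to a limit is treated as breaking the inequality.

```python
import math


def wald_limits(alpha, beta):
    """Critical values for log l: (log beta/(1-alpha), log (1-beta)/alpha)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def decide(log_l, alpha, beta):
    """Apply rule (5) to a current value of the log likelihood ratio."""
    lower, upper = wald_limits(alpha, beta)
    if log_l >= upper:   # right-hand inequality broken
        return "accept H'"
    if log_l <= lower:   # left-hand inequality broken
        return "accept H"
    return "continue"    # take a further observation


if __name__ == "__main__":
    print(wald_limits(0.05, 0.05))
    print(decide(1.2, 0.05, 0.05))
```

With equal risks the two limits are symmetric about zero, so only one number need be remembered during the computation.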

3. APPROXIMATION FOR LOG l

In practice it is easier to work with $u$ than with $t$, since from (2) we find

$$u = \sum_i x_i \Big/ \sqrt{\sum_i x_i^2}, \tag{6}$$

and $u$ can thus be calculated directly provided at each stage we keep a record of the
cumulative sum $\sum_i x_i$ and the cumulative sum of squares $\sum_i x_i^2$. In terms of $u$, making the
substitution from (1) into (4), after some reduction we find

$$\log l = g_n(\delta' u) - g_n(\delta u) - \tfrac12 n(\delta'^2 - \delta^2), \tag{7}$$










where

$$g_n(x) = \tfrac12 x^2 + \log y_n(x) \tag{8}$$

and

$$y_n(x) = \mathrm{Hh}_{n-1}(-x)/\mathrm{Hh}_{n-1}(0). \tag{9}$$

It is known (Airey, 1931) that $y_n(x)$ satisfies the differential equation

$$y_n'' + x y_n' - (n-1) y_n = 0,$$

which becomes, putting $y_n = z\exp[-\tfrac14 x^2]$,

$$z'' - \tfrac14(x^2 + 4n - 2)z = 0. \tag{10}$$

Applying the method of Horn (Jeffreys, 1925; Levy & Baggott, 1934, pp. 230 ff.) to (10) we
may derive an asymptotic solution for large $n$. Denoting $\sqrt{n} = h$, we assume a solution of
(10) in the form

$$z(x) = \gamma(x)\exp[h\theta(x)]\{1 + w_1(x)/h + w_2(x)/h^2 + \ldots\}.$$

Differentiating twice, substituting in (10), and equating coefficients of powers of $h$ to zero,
we find

$$\theta'^2\gamma - \gamma = 0, \qquad \theta''\gamma + 2\theta'\gamma' = 0, \qquad \gamma'' + 2\theta'\gamma w_1' - \tfrac14(x^2-2)\gamma = 0,$$

which give $\theta(x) = \pm x$, $\gamma(x) = c$ and $w_1(x) = \pm\tfrac{1}{24}(x^3 - 6x)$.
The approximate solution of (10) is therefore

$$z(x) = A\exp[hx]\{1 + (x^3 - 6x)/(24h)\} + B\exp[-hx]\{1 - (x^3 - 6x)/(24h)\}$$

as far as the terms in $h^{-1}$. The arbitrary constants $A$ and $B$ can be determined from the
boundary conditions $y_n(0) = 1$ and

$$y_n'(0) = \sqrt{2}\,\Gamma(\tfrac12(n+1))/\Gamma(\tfrac12 n) \sim \sqrt{n},$$

from which it follows that $A = 1$ and $B = 0$. Substituting in (8), to terms in $1/\sqrt{n}$, we find

$$g_n(x) = \tfrac14 x^2 + \sqrt{n}\,x\{1 - 1/(4n) + x^2/(24n)\}. \tag{11}$$

We shall refer to the expressions

$$g_n(x) \approx \tfrac14 x^2 + \sqrt{n}\,x, \tag{12}$$

$$g_n(x) \approx \tfrac14 x^2 + \sqrt{n}\,x\{1 - 1/(4n)\}, \tag{13}$$

as the first and second approximations respectively for $g_n(x)$, and to (11) as the third
approximation.
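
The three approximations (12), (13) and (11), together with formula (7), may be sketched as follows. The function and argument names are hypothetical, introduced only for this example; `order` selects the first, second or third approximation.

```python
import math


def g_approx(x, n, order=3):
    """Approximation to g_n(x): order 1 is (12), order 2 is (13), order 3 is (11)."""
    if order == 1:
        factor = 1.0
    elif order == 2:
        factor = 1.0 - 1.0 / (4.0 * n)
    elif order == 3:
        factor = 1.0 - 1.0 / (4.0 * n) + x * x / (24.0 * n)
    else:
        raise ValueError("order must be 1, 2 or 3")
    return 0.25 * x * x + math.sqrt(n) * x * factor


def log_l_approx(u, n, delta, delta_prime, order=3):
    """Approximate log l from (7), using g_approx in place of g_n."""
    return (g_approx(delta_prime * u, n, order)
            - g_approx(delta * u, n, order)
            - 0.5 * n * (delta_prime ** 2 - delta ** 2))


if __name__ == "__main__":
    # n = 10, delta = 0, delta' = 1, x = delta' * u = 1.0
    for order in (1, 2, 3):
        print(order, round(log_l_approx(1.0, 10, 0.0, 1.0, order), 3))
```

The quantity $u$ is to be supplied from (6), so that only the two cumulative sums need be carried from stage to stage.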


4. ACCURACY OF THE APPROXIMATION 


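For the accurate computation of $g_n(x)$ the existing tables of $\mathrm{Hh}_n(x)$ are of little use because
the range covered is unsuitable. It is better to make use of the fact that the Hh-function is
closely related to the confluent hypergeometric function $M(\alpha, \gamma, x)$, defined by

$$M(\alpha, \gamma, x) = \sum_{i=0}^{\infty} \frac{x^i}{i!}\,\frac{\Gamma(\alpha+i)\,\Gamma(\gamma)}{\Gamma(\alpha)\,\Gamma(\gamma+i)}.$$

In fact

$$g_n(x) = \log\{M(\tfrac12 n, \tfrac12, \tfrac12 x^2) + \sqrt{2}\,x\,M(\tfrac12(n+1), \tfrac32, \tfrac12 x^2)\,\Gamma(\tfrac12(n+1))/\Gamma(\tfrac12 n)\}.$$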
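
A direct evaluation of $g_n(x)$ from the confluent hypergeometric series can be sketched as below. This is an illustration only, not the method by which the tables were computed: the series is simply summed term by term in floating point, the names `kummer_m` and `g_exact` are hypothetical, and for large negative $x$ the two terms in the braces nearly cancel, so that accuracy is lost there.

```python
import math


def kummer_m(a, c, z, tol=1e-15, max_terms=10000):
    """Sum the series M(a, c, z) = sum_i z^i/i! * Gamma(a+i)Gamma(c)/(Gamma(a)Gamma(c+i))."""
    term = 1.0
    total = 1.0
    for i in range(max_terms):
        # Ratio of successive terms: (a+i)/(c+i) * z/(i+1).
        term *= (a + i) / (c + i) * z / (i + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def g_exact(x, n):
    """g_n(x) from the relation with the confluent hypergeometric function."""
    z = 0.5 * x * x
    gamma_ratio = math.exp(math.lgamma(0.5 * (n + 1)) - math.lgamma(0.5 * n))
    inside = (kummer_m(0.5 * n, 0.5, z)
              + math.sqrt(2.0) * x * gamma_ratio * kummer_m(0.5 * (n + 1), 1.5, z))
    return math.log(inside)


if __name__ == "__main__":
    # delta = 0, delta' = 1, n = 10, x = 1.0: log l = g_n(x) - n * delta'^2 / 2
    print(round(g_exact(1.0, 10) - 5.0, 3))
```

Subtracting $\tfrac12 n\delta'^2$ from `g_exact(x, n)` gives the logarithm of the likelihood ratio for the case $\delta = 0$.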

Tables of the confluent hypergeometric function have been computed, and from them
accurate values of $g_n(x)$ have been obtained for values of $x$ ranging from $-10$ to $10$ and
values of $n$ from 2 to 200. These have been checked, as far as possible, by means of the tables
in Airey (1931). Considerations of space forbid the reproduction here of these extensive
tabulations, but some idea of the accuracy of the approximation can be obtained from
Tables 1 and 2. In Table 1 values are given for the logarithm of the likelihood ratio for a set
of values of $n$, $\delta$, $\delta'$ and $x = \delta' u$, calculated respectively from (i) the confluent hypergeometric
function, (ii) the first approximation (12), (iii) the second approximation (13), and (iv) the
third approximation (11). Since $\delta = 0$ in this table, the actual formulae used in calculating
the approximations are those given in the equations referred to, with the subtraction of
$\tfrac12 n\delta'^2$. In Table 2 we have taken $\delta = -\delta'$. In this case the last term in (7) disappears, and the
likelihood ratio is a function of $x = \delta' u$ only.


Table 1. Approximation to $\log l$ (case $\delta = 0$)

Rows: (i) confluent hypergeometric function, (ii) first, (iii) second, (iv) third approximation.

                                            x = 0.2     x = 1.0     x = 2.0     x = 4.0

$n = 2$, $\delta = 0$, $\delta' = 2$       (i)      -3.740      -2.501      -0.384       6.305
                                  (ii)     -3.707      -2.336      -0.172       5.657
                                  (iii)    -3.742      -2.513      -0.526       4.950
                                  (iv)     -3.742      -2.484      -0.290       6.836

$n = 10$, $\delta = 0$, $\delta' = 1$      (i)      -4.373      -1.659       2.250      12.064
                                  (ii)     -4.358      -1.588       2.325      11.649
                                  (iii)    -4.374      -1.667       2.167      11.333
                                  (iv)     -4.374      -1.654       2.272      12.176

$n = 50$, $\delta = 0$, $\delta' = 0.5$    (i)      -4.833       1.040       8.864      26.248
                                  (ii)     -4.826       1.071       8.892      26.034
                                  (iii)    -4.833       1.036       8.821      25.893
                                  (iv)     -4.833       1.042       8.868      26.270

$n = 200$, $\delta = 0$, $\delta' = 0.1$   (i)       1.834      13.377      28.271      59.682
                                  (ii)      1.838      13.392      28.284      59.568
                                  (iii)     1.834      13.374      28.249      59.497
                                  (iv)      1.834      13.377      28.273      59.686

Table 2. Approximation to $\log l$ (case $\delta = -\delta'$)

Rows: (i) confluent hypergeometric function, (ii) first, (iii) second, (iv) third approximation.

                      x = 0.2     x = 0.4     x = 1.0     x = 2.0

$n = 2$      (i)       0.502       1.004       2.565       5.466
            (ii)      0.566       1.131       2.828       5.657
            (iii)     0.495       0.990       2.475       4.958
            (iv)      0.495       0.994       2.534       5.421

$n = 10$     (i)       1.234       2.469       6.196       –
            (ii)      1.265       2.530       6.325       –
            (iii)     1.233       2.467       6.166       –
            (iv)      1.233       2.468       6.193       –

$n = 50$     (i)       2.814       5.629       –           –
            (ii)      2.828       5.657       –           –
            (iii)     2.814       5.629       –           –
            (iv)      2.814       5.629       –           –

$n = 200$    (i)       5.650       –           –           –
            (ii)      5.657       –           –           –
            (iii)     5.650       –           –           –
            (iv)      5.650       –           –           –

In considering these tables it is important to bear in mind that the values of the logarithm
of the likelihood ratio which will occur in practice will be limited by the sequential test
procedure to lie approximately within the limits

$$\log \beta/(1-\alpha) < \log l < \log (1-\beta)/\alpha,$$













and (remembering that natural logarithms are involved) these limits will in absolute value 
lie in the range from about 2 (corresponding roughly to odds of error 10 to 1) to about 
6 (corresponding roughly to odds of error 1000 to 1). It is only in such a range of $\log l$ that it is
important to have a good approximation. It should also be remembered that some inaccuracy 
in the risks of error is inevitable in any sequential test, due to the discreteness of the sample 
size. That this inaccuracy is unimportant in practice is due to the fact that our minds are not 
sensitive to even large changes in the risks of error—a result significant on the z}> level 
carries much the same weight as a result significant on the ;4, level. In all sequential tests 
this inaccuracy in the risks of error is most marked when the hypotheses being tested are 
far apart and the sample size is in consequence small. In this particular test, this means 
that if $(\delta' - \delta)$ is large, the increments in the logarithm of the likelihood ratio at successive
steps will also be large, so that we shall not, in general, ask that any approximation shall be
particularly accurate in such situations: quite a rough approximation will clearly suffice to
show that the critical value has been passed. Another consideration is the fact that values of
$\delta$ or $\delta'$ larger than 3 are unlikely to occur in practice, since $\delta = 3$ corresponds to $p = \Phi(-3) \approx 0.0013$,
approximately, and probabilities smaller than this can usually be neglected; in any case, at
such values of $\delta$ the validity of the assumption that $X$ is normally distributed might well be
called into question. Finally, it should be noted that it follows from the Schwarz inequality
that $-\sqrt{n} \le u \le +\sqrt{n}$, so that, for example, the values given for $x = 4$, $n = 2$ in Table 1
could never occur.

From all these considerations the conclusion appears to be warranted that the third 
approximation (11) can be used with confidence in all practical cases. As will be seen in §6, 
during the course of a sequential procedure it will usually not be necessary to go beyond the 
first approximation until a decision is about to be reached. 


5. THE EFFECT OF STUDENTIZATION 


It is well known that when the sample size $n$ is larger than 30, the effect of studentization
on the classical $t$-test is almost negligible: the $t$-distribution becomes very nearly normal.
Adapting this idea to the present case, we arrive at the conjecture that, for $n$ large enough,
we should be able to take the standard deviation $s$ estimated from the observations as being
equal to $\sigma$. If we make this assumption we will be led to use, for the logarithm of the likelihood
ratio, the expression appropriate to the case where $\sigma$ is known, with $s$ inserted in place
of $\sigma$. If $\delta = 0$, this reduces to

$$\delta' \sum_i x_i/s - \tfrac12 n\delta'^2 = \delta' u\sqrt{n-1}\,\{1 - u^2/n\}^{-\frac12} - \tfrac12 n\delta'^2 \approx \sqrt{n}\,\delta' u + \tfrac12\delta' u^3/\sqrt{n} - \tfrac12 n\delta'^2. \tag{14}$$

Using the second approximation for $g_n(x)$, the corresponding expression for the logarithm
of the likelihood ratio is

$$g_n(\delta' u) - \tfrac12 n\delta'^2 \approx \sqrt{n}\,\delta' u + \tfrac14\delta'^2 u^2 - \tfrac12 n\delta'^2, \tag{15}$$

which differs from (14) in having the term $\tfrac14\delta'^2 u^2$ instead of $\tfrac12\delta' u^3/\sqrt{n}$. For $n$ sufficiently large
these terms are negligible compared with the other terms in (14) and (15), so that our
approximation (15) converges for large $n$ as we would expect.


6. NUMERICAL EXAMPLE 


To illustrate the practical use of the approximations given, let us suppose we wish to test
the hypothesis $H$, that $\delta = 0$, against the hypothesis $H'$, that $\delta' = 0.5$, with risks of
error $\alpha = \beta = 0.05$. The critical values of $\log l$ are then $\log_e \beta/(1-\alpha) = -2.94$ and
$\log_e (1-\beta)/\alpha = +2.94$. The observations on $X$ that we have taken were obtained from the
tables of random normal deviates of Wold (1948), 0.5 being added to each tabulated value,
so that the mean of the population is 0.5 and its standard deviation is 1. Thus a correct decision
will be reached if we decide in favour of $H'$. The arithmetical procedure is set out in Table 3.

In the first column the sample size $n$ is recorded. In the second we keep the cumulative sum
$\sum_i x_i$ of the readings and, in the third, the cumulative sum of the squares of the readings
$\sum_i x_i^2$. A slide rule with the usual four scales A, B, C, D is useful here, since if the value of
$x_i$ is put on the D scale the value of $x_i^2$ can be read off on the A scale; these values can then be
added mentally to the second and third columns to form the required cumulative sums. The
slide rule is again used to form the fourth and fifth columns; dividing the second column by
the square root of the third, we form the quantity $u$ on the D scale. Multiplying this by
$\delta'$ gives us $\delta' u$ on the D scale, and consequently $\delta'^2 u^2$ on the A scale. The fourth column is
then obtained by mental division by 4. Then, multiplying $\delta' u$ on the D scale by $\sqrt{n}$ we get
the fifth column. The sixth column $\tfrac12 n\delta'^2$ is a series of constant multiples of $n$, and is best
prepared in advance. The first approximation to the logarithm of the likelihood ratio is then

$$l_1 = \sqrt{n}\,\delta' u - \tfrac12 n\delta'^2 + \tfrac14\delta'^2 u^2,$$

obtained by subtracting the sixth column from the sum of the fourth and fifth. This is
all that need be calculated until one of the critical values $\pm 2.94$ is approached. When we
are near to a decision, the second approximation

$$l_2 = \sqrt{n}\,\delta' u\{1 - 1/(4n)\} - \tfrac12 n\delta'^2 + \tfrac14\delta'^2 u^2$$

can be obtained from the first by subtracting $a_n = (\sqrt{n}\,\delta' u)/(4n)$, obtained by dividing the
fifth column by $4n$, so that $l_2 = l_1 - a_n$.
Finally, in doubtful cases we may require the third approximation

$$l_3 = \sqrt{n}\,\delta' u\{1 - 1/(4n) + \delta'^2 u^2/(24n)\} - \tfrac12 n\delta'^2 + \tfrac14\delta'^2 u^2,$$

obtained from the second by adding

$$\Delta_n = \tfrac23 a_n \times \tfrac14\delta'^2 u^2,$$

so that $l_3 = l_2 + \Delta_n$.


These corrections have been added in brackets in the last few lines of Table 3. It will be seen
that in this case the procedure terminates at the 24th stage, when the third approximation
to the logarithm of the likelihood ratio is 2.97, greater than the critical value $+2.94$. $H'$ is
therefore accepted, correctly. It may be remarked that the data we have taken are somewhat
unusual, in that the excess 0.03 of the final value of the logarithm of the likelihood ratio
over the critical value is rather small. It will not often happen that three-figure accuracy
will be required to determine whether or not the critical value has been passed.
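
The column arithmetic just described may be sketched for a single stage as follows, for the case $\delta = 0$. The function name and its arguments are hypothetical; the inputs are the running sums of the readings and of their squares, and the three returned values are the first, second and third approximations to the logarithm of the likelihood ratio.

```python
import math


def table3_stage(n, sum_x, sum_x2, delta_prime):
    """One stage of the computation for delta = 0; returns (l1, l2, l3)."""
    u = sum_x / math.sqrt(sum_x2)          # equation (6)
    x = delta_prime * u
    col4 = 0.25 * x * x                    # (1/4) delta'^2 u^2
    col5 = math.sqrt(n) * x                # sqrt(n) delta' u
    col6 = 0.5 * n * delta_prime ** 2      # (1/2) n delta'^2
    l1 = col5 - col6 + col4                # first approximation
    a_n = col5 / (4.0 * n)
    l2 = l1 - a_n                          # second approximation
    l3 = l2 + (2.0 / 3.0) * a_n * col4     # third approximation
    return l1, l2, l3


if __name__ == "__main__":
    # Running sums at the 24th stage, taken from the text's example.
    l1, l2, l3 = table3_stage(24, 14.01, 36.42, 0.5)
    print(round(l1, 2), round(l2, 2), round(l3, 2))
```

Computed in floating point rather than on a slide rule, the figures may differ from the tabulated ones in the last place.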

When $\delta$ is not equal to 0, it may be advantageous to alter the computational procedure.
In the case $\delta = -\delta'$, for example, the column $\tfrac12 n\delta'^2$ becomes unnecessary. But unless rapidity
of computation is of prime importance it is often as well to carry out the procedure just
described, separately for $\delta$ and for $\delta'$. In this way we obtain the logarithms $l$, $l'$ of the
likelihood ratios for $H$ against $H_0$, and for $H'$ against $H_0$, where $H_0$ states that $\mu = 0$. The
logarithm of the likelihood ratio for $H$ against $H'$ is then $l' - l$. This procedure has the
advantage that if, for example, $l'$ and $l$ are both large numerically and both negative, this will
suggest that the true ratio of $\mu$ to $\sigma$ is nearer to 0 than to either $\delta$ or $\delta'$, which would mean that
our initial choice of the hypotheses to be tested was unfortunate, and it would be wise to
start afresh.


Table 3. Results of experimental sampling

 (1)     (2)        (3)         (4)              (5)            (6)            (7)       (8)
  n    $\sum x_i$  $\sum x_i^2$  $\tfrac14\delta'^2u^2$  $\sqrt{n}\,\delta'u$  $\tfrac12 n\delta'^2$    $l_1$     $l_2$, $l_3$

  1      1.79       3.20        –                –              –              –         –
  2      2.19       3.36        0.09             0.85           0.250          0.69      –
  3     -0.13       8.75        –               -0.03           0.375         -0.41      –
  4     -1.33      10.19        0.01            -0.42           0.500         -0.91      –
  5     -0.28      11.29        –               -0.09           0.625         -0.71      –
  6     -0.87      11.64        –               -0.32           0.750         -1.06      –
  7     -0.72      11.66        –               -0.29           0.875         -1.16      –
  8      0.20      12.51        –                0.08           1.000         -0.92      –
  9      0.36      12.53        –                0.15           1.125         -0.97      –
 10      0.62      12.60        –                0.27           1.250         -0.98      –
 11      2.96      18.07        0.03             1.16           1.375         -0.18      –
 12      3.54      18.41        0.04             1.42           1.500         -0.04      –
 13      6.12      25.07        0.09             2.20           1.625          0.67      –
 14      5.48      25.48        0.07             2.04           1.750          0.36      –
 15      5.42      25.48        0.07             2.07           1.875          0.27      –
 16      6.49      26.63        0.10             2.52           2.000          0.62      –
 17      7.81      28.37        0.14             3.07           2.125          1.09      –
 18      8.01      28.41        0.15             3.24           2.250          1.14      –
 19      9.67      31.16        0.19             3.77           2.375          1.58      –
 20      9.59      31.17        0.18             3.85           2.500          1.53      –
 21     10.30      31.67        0.21             4.19           2.625          1.78      (1.73, 1.74)
 22     11.35      32.78        0.25             4.64           2.750          2.14      (2.09, 2.10)
 23     12.91      36.21        0.30             5.23           2.875          2.65      (2.59, 2.60)
 24     14.01      36.42        0.34             5.68           3.000          3.02      (2.96, 2.97)
































7. THE SAMPLE SIZE 


It is always an important matter, before carrying out a sequential test procedure, to obtain
some estimate of the mean sample size associated with it. Unfortunately, since the test we
have discussed is not linear, in the sense that the logarithm of the likelihood ratio does not
increase by statistically independent increments as sampling proceeds, the usual formulae
for estimating the mean sample size do not apply. However, we can apply the idea discussed
in §5 to obtain a rough idea of the mean sample size. This leads us to use the formula given
in Wald (1947) for the mean sample size when $X$ is normally distributed with unit variance,
and we are testing the hypothesis that the mean is $\delta$ against the hypothesis that the mean
is $\delta'$. This will give us a lower bound to the mean sample size, and sampling experiments
indicate that it gives a fair approximation to the true mean value when this is greater than 30.










REFERENCES 


AIREY, J. R. (1931). Tables of Hh functions. British Association Mathematical Tables, 1. London:
Office of the British Association, Burlington House, W. 1.

BARNARD, G. A. (1945). Target Aandicap Charts for Sampling Inspection. London: Ministry of Supply. 

JEFFREYS, H. (1925). Proc. Lond. Math. Soc. 2nd series, 23, 428.

JOHNSON, N. L. & WELCH, B. L. (1940). Biometrika, 31, 362.

LEVY, H. & BAGGOTT, E. A. (1934). Numerical Studies in Differential Equations, 1. London: Watts
and Co.

WALD, A. (1947). Sequential Analysis. New York: John Wiley and Sons, Inc.

WOLD, H. (1948). Table of Random Normal Deviates. Tracts for Computers, no. xxv. Cambridge
University Press.













PROPERTIES OF SOME TESTS IN SEQUENTIAL ANALYSIS 


By A. G. BAKER, B.Sc. 
University College, London 





1. INTRODUCTION 


The general methods of sequential analysis were developed by Wald (1945) and Barnard
(1946). The reader is referred to those papers for a full discussion of the principles of the
subject. The following paper describes these only in outline and gives the results of work
which became necessary in an attempt to construct a new sequential 't' test. A test for
discriminating between normal populations according to their mean, using a sequential
procedure based on the two pivotal mean values $\mu_0$ and $\mu_1 = \mu_0 + \Delta$ when the variance is
known, has been given by Wald (1945). It was desired to see what could be expected if an
estimate of the variance replaced the variance in this test. The performance characteristics
of Wald's test are expressed in terms of certain inequalities. It was therefore decided first
to examine these by means of a large-scale sampling experiment, and so obtain an indication
of what might result for a test in which the variance was replaced by an estimate. This
experiment was designed under the guidance of Dr H. O. Hartley, and the results are discussed
in §3. The three points investigated are: (i) the accuracy of the first and second types of error,
(ii) the distribution of the number of samples necessary to reach a decision and (iii) the effects
of truncation. The results obtained are described below and are consistent with Wald's
theory.

Wald (1945) and Barnard (1946, 1950) give sequential tests for discriminating between
means when $\sigma$ is unknown. Armitage (1947) compares the first and second of these tests,
giving the results of their application to sixty samples. In a later paper it is hoped to
give the results of further sampling experiments on all these tests, together with the results
for a new test which is discussed in §4. The three tests referred to above for the case where
$\sigma$ is unknown are none of them altogether satisfactory, since they are unsymmetrical in
$\mu_0$ and $\mu_1$. Also they differentiate between means of $\mu_0$ and $\mu_1 = \mu_0 + \delta\sigma$, where $\sigma$ is unknown.
Hence, though an experiment may lead to the acceptance of $\mu_1$, its actual value is still
unknown. The theory of the first and third test has not been developed with regard to the
average sampling number nor the operating characteristic function. Stein (1945) gives a very
interesting two-sample test which may be applied to this problem and for which the power is
independent of the variance. This test is not, however, a sequential test in the ordinary sense
of the term, and its average sampling number may exceed that of a sequential test.







2. METHOD FOR TESTING THE MEAN OF A NORMAL POPULATION WHEN THE 
VARIANCE IS KNOWN 


The following is the test referred to above given by Wald (1945). Let


(a) $H_0$ be the hypothesis that the mean of the parent normal population is $\mu_0$;
(b) $H_1$ be the hypothesis that the mean of the parent normal population is $\mu_1$;
(c) $x_1, x_2, x_3, \ldots, x_n$ be the numerical results for the first $n$ experiments;

(d) $\alpha$, $\beta$ be the probabilities associated with the first and second types of error.


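Write

$$A = \frac{1-\beta}{\alpha}, \qquad B = \frac{\beta}{1-\alpha} \qquad \text{and} \qquad z_i = x_i - \tfrac12(\mu_1 + \mu_0).$$

The test rule is then as follows:

$$\text{if } \sum_{i=1}^{n} z_i \ge \frac{\sigma^2}{\mu_1 - \mu_0}\log_e A, \text{ accept } H_1; \qquad \text{if } \sum_{i=1}^{n} z_i \le \frac{\sigma^2}{\mu_1 - \mu_0}\log_e B, \text{ accept } H_0; \tag{1}$$

otherwise continue testing.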

A word seems necessary as to the terminology used. If there are only two admissible
alternatives, that the population mean equals $\mu_0$ or equals $\mu_1$, then we aim at discrimination
and end up with either 'accepting' $H_0$ or $H_1$. Commonly, however, a wide range of values of
$\mu$ are possible, and we require a procedure for dividing the material sampled into two
categories. Thus 'accept $H_0$' means 'assign to category 0', 'accept $H_1$' means 'assign to
category 1'. The properties of the procedure are described by the operating characteristic
function defined below, and the procedure is adjusted to give certain desired long-run
results for populations having means at the pivotal values $\mu_0$ and $\mu_1$. When the term 'accept
$H_i$' $(i = 0, 1)$ is used below, it is to be interpreted in this sense.

The actual magnitude of « and £ are only approximately obtained in practice. If «,, , are 
the true values, Wald gives the inequalities, when j, > Mo, 





1-B _ _ 6-B . 
34—B SS 34-8 (2) 

B(A-1) B(8A—1) 

$4—B S'S “34—B ° @) 


3 O(— Ma Mo)lo) 
G($(uy—Mo)/0) ’ 





where 


(4) 


G(x) = 53, e~t* dx. (5) 


Operating characteristic function
If $\mu$ is the true mean of the population, then the probability of accepting $H_0$ is a function
of $\mu$. This function of $\mu$, $L(\mu)$, is called the operating characteristic function (O.C. function).
Wald (1945) gives an approximate formula for $L(\mu)$:

$$L(\mu) = \frac{A^{h(\mu)} - 1}{A^{h(\mu)} - B^{h(\mu)}}, \tag{6}$$

where

$$h(\mu) = \frac{\mu_1 + \mu_0 - 2\mu}{\mu_1 - \mu_0}.$$

A different method of obtaining the approximation is as follows. Let $H$, i.e. the hypothesis
that the mean is $\mu$, be true. Consider the test when $H$ is tested against $H'$ (the hypothesis
that the mean is $\mu_1 + \mu_0 - \mu$). Let $\alpha'$, $\beta'$ be the magnitudes of the associated basic errors. The
constants of the test are then

$$\frac{\sigma^2}{\mu_1 + \mu_0 - 2\mu}\log_e\frac{\beta'}{1-\alpha'} \qquad \text{and} \qquad \frac{\sigma^2}{\mu_1 + \mu_0 - 2\mu}\log_e\frac{1-\beta'}{\alpha'}.$$

It follows that since $H$ is true the probability of accepting $H$ is approximately $1-\alpha'$.
Compare this test with that of testing $H_0$ against $H_1$. If the constants of the two tests are
equated we have

$$\frac{\sigma^2}{\mu_1 - \mu_0}\log_e\frac{1-\beta}{\alpha} = \frac{\sigma^2}{\mu_1 + \mu_0 - 2\mu}\log_e\frac{1-\beta'}{\alpha'}, \qquad \frac{\sigma^2}{\mu_1 - \mu_0}\log_e\frac{\beta}{1-\alpha} = \frac{\sigma^2}{\mu_1 + \mu_0 - 2\mu}\log_e\frac{\beta'}{1-\alpha'}. \tag{7}$$

By equating the constants the probability of accepting $H$ when $H$ is true is equal to the
probability of accepting $H_0$ when $H$ is true, for decisions in the two tests take place
simultaneously. Solving equation (7) for $1-\alpha'$, we have

$$1 - \alpha' = L(\mu) = \frac{A^{h(\mu)} - 1}{A^{h(\mu)} - B^{h(\mu)}}.$$
Average sampling number

The number of observations necessary to reach a decision will be called the decisive
sampling number, or D.S.N. Following Wald's notation, the average D.S.N. will be called the
average sampling number or A.S.N. Wald shows that if $\mu$ is the true mean

$$\text{A.S.N.} = \mathcal{E}_\mu(n) = \frac{\{1 - L(\mu)\}\log_e A + L(\mu)\log_e B}{\dfrac{\mu_1 - \mu_0}{\sigma^2}\{\mu - \tfrac12(\mu_1 + \mu_0)\}}. \tag{8}$$
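
The approximate formulae (6) and (8) can be evaluated as in the following sketch. The function names are hypothetical, and $\mu_1 > \mu_0$ is assumed. At the midpoint $\mu = \tfrac12(\mu_1 + \mu_0)$ the exponent $h(\mu)$ vanishes; the sketch then returns the limiting value of (6), and declines to evaluate (8), which takes the form $0/0$ there.

```python
import math


def oc_function(mu, mu0, mu1, alpha, beta):
    """Approximate probability L(mu) of accepting H0, formula (6)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    h = (mu1 + mu0 - 2.0 * mu) / (mu1 - mu0)
    if abs(h) < 1e-12:
        # Limit of (6) as h tends to zero.
        return log_a / (log_a - log_b)
    a_h = math.exp(h * log_a)
    b_h = math.exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)


def average_sampling_number(mu, mu0, mu1, sigma, alpha, beta):
    """Approximate A.S.N. when the true mean is mu, formula (8)."""
    drift = (mu1 - mu0) / sigma ** 2 * (mu - 0.5 * (mu1 + mu0))
    if abs(drift) < 1e-12:
        raise ValueError("formula (8) is indeterminate at the midpoint")
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    big_l = oc_function(mu, mu0, mu1, alpha, beta)
    return ((1.0 - big_l) * log_a + big_l * log_b) / drift


if __name__ == "__main__":
    print(round(oc_function(0.0, 0.0, 1.0, 0.05, 0.05), 4))
    print(round(average_sampling_number(0.0, 0.0, 1.0, 1.0, 0.05, 0.05), 2))
```

At the pivotal values the function (6) reproduces the nominal risks, $L(\mu_0) = 1 - \alpha$ and $L(\mu_1) = \beta$, which provides a convenient check on the arithmetic.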
Since the formula is approximate, Wald obtained the following limits. 
Let A = (4, —o)/9, 
Thy = L— FH, + Mo), 
= O(—%) 
Dat tes O(4,) ot 
f= Mtg) a 
1 
=> —iz* 
where P(x) I on)° , (11) 
and ¢(z) is given by (4). 
Tf 4 > $(444 + Mo), 1 > Ho 
L(u){log. B+ §°} + {1—L(u)} log, A — » (n) < log, B+ {1—L(u)} {log, A + §} (12) 
(Hy — Ho) {4 — (Hy + Ho) }/0? . 4 (My — Mo) {4 — (44 + Mp) }/07 
If ¢<$(4,+ Mo), 41> Ho 
Lu) log, B+ {1—L(u)} flog. A + §} - 6,(n) < Mu ilog. B+ 6} + {1-Lw)}logeA (4) 


(fy — Mo) {4 — (Hy + My) }/ 0? 


3. SAMPLING EXPERIMENT 


In sequential analysis the D.S.N. is a random variable. Its distribution is important for two
reasons. First it enables some idea to be obtained of the probability of getting a large D.S.N.,
for although the A.S.N. might be small, the spread of the distribution is also important.
Secondly, it gives an indication of the effect of truncation on the two types of errors. The
following method of truncation has been suggested by Wald.


4 (My — Mo) {4 — $y + Mo) }/ 0? 




If it is decided to cease testing after $n_0$ observations and a decision has not already been
reached,

$$\text{accept } H_1 \text{ if } \sum_{i=1}^{n_0} z_i > 0, \qquad \text{and accept } H_0 \text{ if } \sum_{i=1}^{n_0} z_i < 0. \tag{14}$$


The sampling experiment which was planned to investigate these points was conducted
as follows. Hollerith cards were available, loaned by the National Physical Laboratory, on
each of which was punched the value, say $x_i$, of a random normal deviate taken from Wold's
table (1948). These deviates come from a population having zero mean and unit variance,
so that $\mu_0$ was taken as zero and $\sigma$ as unity; $\mu_1$ was taken as unity. The cards were then fed
into the Hollerith Senior Roller Tabulator, so adjusted that the machine computed

$$Z_n = \sum_{i=1}^{n} z_i = \sum_{i=1}^{n} (x_i - 0.5). \tag{15}$$

As soon as $Z_n$ went outside either of the limits of equation (1), with $\alpha = \beta = 0.01$, i.e. 4.60
and $-4.60$, the machine printed the corresponding D.S.N. (in a way which indicated which
limit had been passed) and cleared. The machine then started again automatically, the
process being repeated.
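
The procedure carried out on the tabulator can be sketched in code as follows, combining the test rule (1) with the truncation rule (14). This is an illustration only: the function name and the returned labels are hypothetical, $\mu_1 > \mu_0$ is assumed, and a sum exactly equal to zero at truncation, a case which rule (14) leaves open, is here assigned to $H_0$.

```python
import math


def wald_mean_test(xs, mu0, mu1, sigma, alpha, beta, n0=None):
    """Run the known-variance sequential test on the readings xs.

    Returns (decision, n), where decision is "H1", "H0" or "undecided"
    and n is the number of observations used.
    """
    scale = sigma ** 2 / (mu1 - mu0)
    upper = scale * math.log((1.0 - beta) / alpha)   # limit for accepting H1
    lower = scale * math.log(beta / (1.0 - alpha))   # limit for accepting H0
    z_sum = 0.0
    n = 0
    for n, x in enumerate(xs, start=1):
        z_sum += x - 0.5 * (mu1 + mu0)
        if z_sum >= upper:
            return "H1", n
        if z_sum <= lower:
            return "H0", n
        if n0 is not None and n == n0:
            # Truncation rule (14); a zero sum is assigned to H0 (assumption).
            return ("H1" if z_sum > 0 else "H0"), n
    return "undecided", n


if __name__ == "__main__":
    import random
    rng = random.Random(1)
    readings = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(wald_mean_test(readings, 0.0, 1.0, 1.0, 0.01, 0.01))
```

Repeating the call on fresh sets of random normal deviates and counting the decisions in favour of $H_1$ gives an experimental estimate of the first type of error.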

With $\alpha$ and/or $\beta > 0.01$ the limits were narrower and the appropriate D.S.N. had to be
obtained by examining the printed values of $Z_n$. The total number of tests performed was
2003. The results are summarized in Tables 1A and 2. Table 1A shows that the results of the
sampling are consistent with Wald's theory. The sampling value of $\alpha$ is approximately 0.7
times the nominal value; thus there is a considerable margin of safety. The mean of the limits
given in (2) gives a more accurate value for the first type of error than the nominal value of $\alpha$.
This is illustrated in Table 1B.


Table 1A. Results of sampling experiment for investigating errors

$\alpha$                                  0.05               0.025              0.01               0.01
$\beta$                                   0.05               0.025              0.01               0.05

Limits for $\alpha$ given by (2)          0.0222 to 0.0515   0.0108 to 0.0254   0.0045 to 0.0101   0.0045 to 0.0100
Experimental estimate of $\alpha$         0.0354             0.0155             0.0070             0.0075
A.S.N. or $\mathcal{E}_0(n)$              5.6                7.2                9.1                6.1
Limits for A.S.N. given by (13)          5.2 to 7.2         6.9 to 8.9         9.0 to 11.0        5.8 to 7.8
Experimental A.S.N.                      7.0                8.8                10.9               7.5























Table 1B. Comparison of sample value of $\alpha$ with the average of
limit values given in equation (2)

Sample estimate of $\alpha$     Mean of limits given in (2)

0.0354                          0.0368
0.0155                          0.0181
0.0070                          0.0073
0.0075                          0.0073















This suggests that as an empirical rule, the value of the actual $\alpha$ should be taken as the
mean of the limits given in equation (2).


Table 2. Showing effect of truncation at $n = n_0$

Cols. (4) to (6): observed frequency of wrong decisions, (4) before truncation, (5) at truncation, (6) total.
Col. (7): proportion of wrong decisions (experimental).
Col. (8): upper limit of probability of wrong decisions (theory).

 $\alpha$    $\beta$     $n_0$     Before    At       Total     (7)        (8)
  (1)      (2)      (3)       (4)       (5)      (6)

 0.05     0.05      6         40        205      245       0.1223     0.1528
                    8         46        149      195       0.0973     0.1216
                    11        59        58       117       0.0584     0.0932
                    16        66        21       87        0.0434     0.0697
                    $\infty$   71        0        71        0.0354     0.0515

 0.025    0.025     7         14        194      208       0.1038     0.1145
                    11        24        86       110       0.0549     0.0707
                    14        29        45       74        0.0369     0.0534
                    21        30        13       43        0.0215     0.0350
                    $\infty$   31        0        31        0.0155     0.0254

 0.01     0.01      10        6         122      128       0.0639     0.0658
                    14        14        54       68        0.0339     0.0396
                    19        14        22       36        0.0180     0.0237
                    28        14        7        21        0.0105     0.0136
                    $\infty$   14        0        14        0.0070     0.0101

 0.01     0.05      6         1         244      245       0.1223     0.1195
                    9         4         152      156       0.0778     0.0755
                    12        13        79       92        0.0459     0.0508
                    18        15        23       38        0.0190     0.0263
                    $\infty$   15        0        15        0.0075     0.0100

















Table 2 has been arranged to show the effect of truncation when $H_0$ is true. Four levels of
truncation have been taken, namely $n_0 = [\mathcal{E}_0(n)]$, $[1.5\,\mathcal{E}_0(n)]$, $[2.0\,\mathcal{E}_0(n)]$ and $[3.0\,\mathcal{E}_0(n)]$, where
$[x]$ means the smallest integer greater than $x$. The appropriate values of $n_0$ are shown in column
(3); $n_0 = \infty$ corresponds to the case where the sequential procedure is carried through to the
end without truncation. Cols. (4) and (5) give the number of wrong decisions before truncation
at $n_0$ and at truncation, on using the rule (14). Col. (6) is the sum of cols. (4) and (5), while
col. (7) is col. (6) ÷ 2003. The final column gives figures derived from Wald (1947, §3.8) and
equation (2). The figures in col. (7) are always less than those in col. (8), except in two instances
where the difference is no more than the sampling standard error. It will be seen that if
$\alpha = \beta$ and truncation is made at $n_0 = [3\mathcal{E}_0(n)]$, the risk of error remains below, or only just
above, the nominal value $\alpha$. However, when $\alpha < \beta$ it is necessary to truncate at $[4\mathcal{E}_0(n)]$ or
$[5\mathcal{E}_0(n)]$ to keep within the nominal error.
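The kind of sampling experiment described above can be imitated on a small scale. The following is only a minimal sketch, assuming Wald's usual boundary approximations $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ and the case $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$ with $H_0$ true; the function names, the seed and the number of runs are illustrative choices, and the truncation rule (14) is not reproduced.

```python
import math
import random


def wald_sprt(mu_true, mu0, mu1, sigma, alpha, beta, rng, max_n=10_000):
    """Run one Wald sequential test for a normal mean with known sigma.

    Returns (decision, n): decision is 0 (accept H0) or 1 (accept H1).
    """
    log_a = math.log((1 - beta) / alpha)   # upper boundary, log A
    log_b = math.log(beta / (1 - alpha))   # lower boundary, log B
    midpoint = 0.5 * (mu0 + mu1)
    scale = (mu1 - mu0) / sigma ** 2
    llr = 0.0                              # cumulative log likelihood ratio
    for n in range(1, max_n + 1):
        x = rng.gauss(mu_true, sigma)
        llr += scale * (x - midpoint)
        if llr >= log_a:
            return 1, n
        if llr <= log_b:
            return 0, n
    # Safety stop only; this is a hypothetical fallback, not the rule (14).
    return (1 if llr > 0 else 0), max_n


def estimate_risk(alpha, beta, runs=20_000, seed=1950):
    """Estimate the first-kind error rate and the A.S.N. when H0 is true."""
    rng = random.Random(seed)
    wrong = 0
    total_n = 0
    for _ in range(runs):
        decision, n = wald_sprt(0.0, 0.0, 1.0, 1.0, alpha, beta, rng)
        wrong += decision
        total_n += n
    return wrong / runs, total_n / runs


if __name__ == "__main__":
    rate, asn = estimate_risk(0.05, 0.05)
    print("estimated alpha:", round(rate, 4), "estimated A.S.N.:", round(asn, 2))
```

The estimated error rate from such a run may be set against the nominal $\alpha$ and the limits of Table 1A; the figures obtained will of course vary with the seed and the number of runs.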

The sampling experiment thus leads to the following conclusions: 


(1) A good value for $\mathcal{E}_0(n)$ is obtained by using the upper limit for $\mathcal{E}_0(n)$ given by (13).

(2) A good value for $\alpha$ (and possibly $\beta$) is obtained by taking the mean of Wald's limits
given in (2). The degree of agreement is shown in Table 1B.

(3) When $\alpha = \beta$ it is safe to truncate at $[3\mathcal{E}_0(n)]$, where $\mathcal{E}_0(n)$ is calculated from (8). When
$\alpha < \beta$ it is advisable to truncate at $[4\mathcal{E}_0(n)]$ or even $[5\mathcal{E}_0(n)]$ to keep the error less than the
nominal value. If $\alpha > \beta$ this suggests that it might be satisfactory to truncate at $[2\mathcal{E}_0(n)]$.

These conclusions, of course, apply strictly only to the situation where $\mu_1 - \mu_0 = \sigma$ and
$H_0$ is true. It is hoped later to use some further sampling results of wider applicability.

Distribution of the decisive sampling number 
The exact theoretical derivation of the distribution of the D.S.N. is a difficult problem.
Kac (1945) gives the form of the distribution, and as far as the present investigation has gone
this appears to give a satisfactory fit to the data obtained. In Kac's form we have

$$P\{n > n'\} = \sum_{r=1}^{\infty} \lambda_r^{n'} F_r \qquad (n' > 2). \tag{16}$$

Table 3A. Comparison of fitted distributions with actual sample
distributions of decisive sampling numbers

Fitted constants:
  $\alpha = \beta = 0.05$:           $\lambda_1 = 0.8153$, $G_1 = 673.4$;  $\lambda_2 = 0.2604$, $G_2 = -6850$;  $\lambda_3 = 0.1366$, $G_3 = 9125$
  $\alpha = \beta = 0.025$:          $\lambda_1 = 0.8362$, $G_1 = 676.9$
  $\alpha = \beta = 0.01$:           $\lambda_1 = 0.8482$, $G_1 = 773.9$
  $\alpha = 0.01$, $\beta = 0.05$:   $\lambda_1 = 0.8254$, $G_1 = 627.9$;  $\lambda_2 = 0.2857$, $G_2 = -5402$;  $\lambda_3 = 0.1491$, $G_3 = 6936$

           $\alpha=\beta=0.05$    $\alpha=\beta=0.025$   $\alpha=\beta=0.01$    $\alpha=0.01$, $\beta=0.05$
Sample     Observed   Fitted      Observed   Fitted      Observed   Fitted      Observed   Fitted
size       frequency  frequency   frequency  frequency   frequency  frequency   frequency  frequency
1          12         11.8        0          -           0          -           8          9
2          156        153.5       60         -           7          -           141        141
3          248        267.2       140        -           76         -           239        249.9
4          270        268.8       217        -           100        -           253        253.3
5          257        236.8       213        -           157        -           252        231.0
6          206        196.2       219        -           187        -           199        195.9
7          156        161.2       193        193         183        -           165        163.4
8          141        131.4       158        161.6       177        -           137        135.6
9          101        107.0       125        135.2       157        -           97         111.9
10         81         87.3        112        113.0       139        149.2       81         92.4
11         61         71.1        99         94.5        119        126.6       69         76.3
12         66         57.9        85         79.0        107        107.4       68         63.0
13         58         47.3        63         66.1        83         91.1        59         52.0
14         44         38.3        54         55.2        78         77.2        47         42.9
15         34         31.4        43         46.2        58         65.5        37         35.5
16         19         25.0        41         38.7        65         55.6        23         29.3
17         20         20.9        21         32.3        45         47.2        22         24.2
18         16         17.0        37         27.0        38         40.0        20         20.0
19         13         13.9        26         22.6        31         33.9        17         16.5
20         10         11.3        15         18.9        29         28.8        12         13.6
21-25      27         31.7        54         56.9        92         90.2        36         39.7
26-30      4          11.6        18         23.2        39         39.5        11         15.2
31-35      2          4.1         5          9.4         21         17.4        6          5.9
≥ 36       0          1.0         4          4.3         15         11.4        4          2.7

















































Table 3B. Frequency of decisions after $[\mathcal{E}_0(n)]$, $[1.5\,\mathcal{E}_0(n)]$, $[2.0\,\mathcal{E}_0(n)]$, $[3.0\,\mathcal{E}_0(n)]$

$\alpha=\beta=0.05$              $\alpha=\beta=0.025$             $\alpha=\beta=0.01$              $\alpha=0.01$, $\beta=0.05$
       Frequency  Estimated             Frequency  Estimated             Frequency  Estimated             Frequency  Estimated
$n_0$  $n>n_0$    probability    $n_0$  $n>n_0$    probability    $n_0$  $n>n_0$    probability    $n_0$  $n>n_0$    probability
                  $n>n_0$                          $n>n_0$                          $n>n_0$                          $n>n_0$
6      854        0.4261         7      961        0.4795         10     820        0.4092         6      911        0.4546
8      557        0.2779         11     467        0.2330         14     433        0.2161         9      512        0.2555
11     314        0.1567         14     265        0.1322         19     196        0.0978         12     294        0.1467
16     93         0.0464         21     62         0.0309         28     39         0.0195         18     86         0.0429

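Hence

$$P\{n = n'\} = \sum_{r=1}^{\infty} \lambda_r^{n'} G_r \qquad (n' > 3), \tag{17}$$

where $G_r = (1 - \lambda_r)F_r/\lambda_r$ and $1 > \lambda_1 \ge \lambda_2 \ge \ldots$,
and $F_r$ is independent of $n'$. We shall use this expression also for $P\{n = 2\}$.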

The method of fitting equation (17) to the observed distributions of Table 3A was purely
empirical and based on the assumption that the largest $\lambda$, i.e. $\lambda_1$, would play a predominating
part in the tail of the distribution. It will be seen that in two cases a reasonable fit to observation
has been obtained using three pairs of parameters $(\lambda, G)$. In the other two cases the fit
at the start proved more difficult to secure, and the approximation given by $(\lambda_1, G_1)$ only is
shown. The distributions do not differentiate between right and wrong decisions. It is hoped
later, with further sampling, to be able to distinguish between them. The greatest value for
$n$ for the last three cases was 52. This was an isolated value, for the next largest was about
40 for all three distributions. Considering the distribution as continuous, it was found that
$P\{n > \mathcal{E}_0(n)\} \approx 0.47$. This suggests that approximately half of the decisions will be reached
before $\mathcal{E}_0(n)$. The long tails show that there is a danger of obtaining a large D.S.N. and hence
truncation at some point is advisable. Further work on these distributions will be given in
a later paper.
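The fitted frequencies of Table 3A can be recomputed from the tabulated constants. The sketch below assumes that each fitted frequency is the sum of $G_r \lambda_r^{n}$ over the listed pairs, with $G_r$ expressed in frequency units (out of the 2003 samples); the function and variable names are illustrative only.

```python
def fitted_frequency(n, pairs):
    """Sum of G_r * lambda_r**n over the supplied (lambda_r, G_r) pairs."""
    return sum(g * lam ** n for lam, g in pairs)


# Constants for alpha = beta = 0.05 as listed in Table 3A.
PAIRS_005 = [(0.8153, 673.4), (0.2604, -6850.0), (0.1366, 9125.0)]

if __name__ == "__main__":
    for n in (1, 2, 10, 20):
        print(n, round(fitted_frequency(n, PAIRS_005), 1))
```

For large $n$ only the first pair contributes appreciably, which is the sense in which $\lambda_1$ predominates in the tail.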

The results of the sampling experiment show that for the case investigated, namely, where
$(\mu_1 - \mu_0)/\sigma = 1$, Wald's test is safe, in the sense that the true risks $\alpha_t$ and $\beta_t$ are well below the
nominal risks $\alpha$ and $\beta$. It seems probable that this will also be the case for the test discussed
below, when an estimate of the variance is used in place of the true variance.


4. A SUGGESTED SEQUENTIAL TEST FOR THE MEAN OF A NORMAL POPULATION 
USING AN ESTIMATE OF THE VARIANCE 


If the character of the problem is such that we wish to define the difference $\mu_1 - \mu_0$ between
the pivotal values absolutely, and not in terms of an unknown standard deviation, it seems
natural to explore the performance characteristics of a sequential test in which an estimate
of $\sigma$ is used. We shall therefore suppose that we have available a mean square estimate
$s^2$ of $\sigma^2$, based on $f$ degrees of freedom, so that $fs^2/\sigma^2$ is distributed as $\chi^2_f$. This $s^2$ may either
have been obtained from some previous series of observations, or we may suppose that the
first $f + 1$ observations of the current series are set aside to estimate $\sigma^2$ before the sequential
procedure is brought into play.







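In the test now examined this value of $s^2$ will be used in place of $\sigma^2$ in Wald's test of the
preceding sections, but the decision constants $A$ and $B$ will be modified. Suppose these new
constants are

$$-\frac{s^2 D(\alpha, \beta, f)}{\mu_1 - \mu_0} \quad\text{and}\quad \frac{s^2 C(\alpha, \beta, f)}{\mu_1 - \mu_0}, \tag{18}$$

where $C(\alpha, \beta, f)$ and $D(\alpha, \beta, f)$ represent functions of $\alpha$, $\beta$ and $f$. In the case previously stated
($\sigma$ known) where $f$ is in effect infinite, we have

$$C(\alpha, \beta, \infty) = \log_e A, \qquad D(\alpha, \beta, \infty) = -\log_e B. \tag{19}$$

The reference set of the new test is two-dimensional; we have first to consider the distribution
of sequential results in the subset for which $s^2$ is fixed and then to allow $s^2$ to vary randomly,
following the $\chi^2$ law.

The problem is first to derive for a given $s^2$ the relation between $C(\alpha, \beta, f)$, $D(\alpha, \beta, f)$ and
the associated nominal risks $\alpha(s^2)$, $\beta(s^2)$. This relation is then used to determine $C(\alpha, \beta, f)$,
$D(\alpha, \beta, f)$ so that

$$\int_0^\infty \alpha(s^2)\,p(s^2)\,ds^2 = \alpha, \qquad \int_0^\infty \beta(s^2)\,p(s^2)\,ds^2 = \beta. \tag{20}$$

In future we will write $C$ and $D$ instead of $C(\alpha, \beta, f)$ and $D(\alpha, \beta, f)$, it being understood that
they are functions of these variables.
For a particular $s^2$ we have

$$s^2 C = \sigma^2 \log_e \frac{1 - \beta(s^2)}{\alpha(s^2)}, \qquad s^2 D = -\sigma^2 \log_e \frac{\beta(s^2)}{1 - \alpha(s^2)}. \tag{21}$$

Solving these equations, we obtain

$$\alpha(s^2) = \frac{1 - \exp\{-Ds^2/\sigma^2\}}{\exp\{Cs^2/\sigma^2\} - \exp\{-Ds^2/\sigma^2\}}, \tag{22}$$

$$\beta(s^2) = \frac{(\exp\{Cs^2/\sigma^2\} - 1)\exp\{-Ds^2/\sigma^2\}}{\exp\{Cs^2/\sigma^2\} - \exp\{-Ds^2/\sigma^2\}}. \tag{23}$$

Then multiplying both by $p(s^2)$ and integrating out with respect to $s^2$, we have after making
the transformation $t = s^2/\sigma^2$,

$$\int_0^\infty \alpha(s^2)\,p(s^2)\,ds^2 = \frac{(\tfrac12 f)^{\frac12 f}}{\Gamma(\tfrac12 f)} \int_0^\infty \frac{t^{\frac12 f - 1} e^{-\frac12 f t}\,(1 - e^{-Dt})}{e^{Ct} - e^{-Dt}}\,dt = \alpha \tag{24}$$

and

$$\int_0^\infty \beta(s^2)\,p(s^2)\,ds^2 = \frac{(\tfrac12 f)^{\frac12 f}}{\Gamma(\tfrac12 f)} \int_0^\infty \frac{t^{\frac12 f - 1} e^{-\frac12 f t}\,e^{-Dt}(e^{Ct} - 1)}{e^{Ct} - e^{-Dt}}\,dt = \beta. \tag{25}$$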

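The integrals can be solved by expanding the denominator in series, interchanging the
summation and integration signs and next integrating. This gives

$$\alpha = (\tfrac12 f)^{\frac12 f} \sum_{r=0}^{\infty} \left\{ (\tfrac12 f + C + r(C + D))^{-\frac12 f} - (\tfrac12 f + (r+1)(C + D))^{-\frac12 f} \right\} \tag{26}$$

and

$$\beta = (\tfrac12 f)^{\frac12 f} \sum_{r=0}^{\infty} \left\{ (\tfrac12 f + D + r(C + D))^{-\frac12 f} - (\tfrac12 f + (r+1)(C + D))^{-\frac12 f} \right\}. \tag{27}$$

In any particular case we require $C$ and $D$ to satisfy (26) and (27).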
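The series (26) and (27) are easily summed numerically. The following is a minimal sketch; the function name and the fixed number of terms are illustrative choices, and the series are simply truncated once later terms are negligible.

```python
def nominal_risks(c, d, f, terms=2000):
    """Sum the series (26) and (27) for the nominal risks alpha and beta.

    c, d are the constants C and D; f is the degrees of freedom of s^2.
    The factor (f/2)**(f/2) is folded into each term to avoid overflow.
    """
    half_f = 0.5 * f
    alpha = 0.0
    beta = 0.0
    for r in range(terms):
        common = (half_f / (half_f + (r + 1) * (c + d))) ** half_f
        alpha += (half_f / (half_f + c + r * (c + d))) ** half_f - common
        beta += (half_f / (half_f + d + r * (c + d))) ** half_f - common
    return alpha, beta


if __name__ == "__main__":
    # C = D = 6.40 with f = 14 should give risks close to 0.01.
    print(nominal_risks(6.40, 6.40, 14))
    # Unequal constants, f = 12.
    print(nominal_risks(6.78, 3.68, 12))
```

With $C = D$ the two risks coincide, as the symmetry of (26) and (27) requires.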
For values of $f$ = 4, 6, 8 and 10 it is useful, for checking computation, to note that (26) and
(27) can be written in the form of polygamma functions (Davis, 1935). Thus





ater aaen (fee) wo) a 
ii ja LAO Dre fa) efile) ox) 
















In the computation of $C$ and $D$ for specific values of $\alpha$, $\beta$ and $f$, it was found that the series
converge rapidly. For $f > 6$ the values of $C$ and $D$ were obtained for $f$ even and the odd values
obtained by interpolation. In some cases it was necessary to use the fourth differences in
Bessel's formula. The values, which are given in Table 4, are accurate to the second decimal
place except for $f < 3$. If $f > 40$ and $\alpha = \beta$, an approximation accurate to 0.0025 is given as
follows: aif 
C=D A, (30) 
i.e. only the first term in the relevant series is used. The table can thus be quickly extended
in this manner. As is to be expected

$$\lim_{f \to \infty} C = \log_e \frac{1 - \beta}{\alpha}, \qquad \lim_{f \to \infty} D = -\log_e \frac{\beta}{1 - \alpha}.$$

Table 4. $C(\alpha, \beta, f)$ and $D(\alpha, \beta, f)$, the constants to use in conjunction with $s^2/(\mu_1 - \mu_0)$

$f$        $\alpha=\beta=0.01$    $\alpha=\beta=0.025$    $\alpha=\beta=0.05$
           $C=D$                  $C=D$                   $C=D$
1          1800.00                250.00                  68.00
2          67.00                  26.00                   13.60
3          25.00                  12.84                   7.50
4          15.93                  9.23                    5.84
5          12.10                  7.60                    5.08
6          10.24                  6.70                    4.65
7          9.07                   6.13                    4.33
8          8.31                   5.73                    4.12
9          7.75                   5.44                    3.97
10         7.35                   5.23                    3.85
11         7.03                   5.06                    3.76
12         6.78                   4.92                    3.68
13         6.57                   4.81                    3.62
14         6.40                   4.72                    3.56
15         6.26                   4.64                    3.52
16         6.13                   4.57                    3.48
17         6.03                   4.51                    3.44
18         5.94                   4.46                    3.41
19         5.86                   4.41                    3.39
20         5.79                   4.37                    3.36
25         5.52                   4.22                    3.27
30         5.35                   4.12                    3.22
35         5.23                   4.05                    3.18
40         5.15                   4.00                    3.15
$\infty$   4.60                   3.66                    2.94
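Entries of Table 4 for $\alpha = \beta$ can be checked by solving (26) numerically with $C = D$. The sketch below uses simple bisection, which is an illustrative choice and not the interpolation method described in the text; it assumes that the risk decreases steadily as $C$ increases, and it is intended for moderate $f$ (say $f \ge 3$), where the truncated series is accurate.

```python
def alpha_equal_constants(c, f, terms=2000):
    """Nominal risk from series (26) when C = D = c."""
    half_f = 0.5 * f
    total = 0.0
    for r in range(terms):
        total += (half_f / (half_f + c + 2 * r * c)) ** half_f
        total -= (half_f / (half_f + 2 * (r + 1) * c)) ** half_f
    return total


def solve_constant(alpha, f, lo=0.1, hi=5000.0, tol=1e-6):
    """Find c with alpha_equal_constants(c, f) = alpha by bisection.

    Assumes the risk is a decreasing function of c on [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alpha_equal_constants(mid, f) > alpha:
            lo = mid    # risk still too large, so c must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(solve_constant(0.01, 14), 2))
    print(round(solve_constant(0.05, 10), 2))
```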




















The question naturally arises as to the accuracy with which the nominal $\alpha$ and $\beta$ represent
the true risks. When the test with the appropriate $C$ and $D$ is applied to the subset of sequential
samples with a fixed value of $s^2$, Wald's limits of equation (2) will apply to the true $\alpha_t(s^2)$ and
$\beta_t(s^2)$. Clearly, through equation (20), it should be possible to obtain limits which will include
the true overall risks $\alpha_t$ and $\beta_t$ of the test with a high degree of probability. If, say, it could
be shown empirically or otherwise that for Wald's test with $\sigma$ known and $\alpha = \beta$,

$$0.6\alpha < \alpha_t < 0.8\alpha,$$

then the same inequality would hold for the present test. For the moment it is not possible
to do more than state that it seems likely that the test is safe, in the sense that $\alpha_t < \alpha$, $\beta_t < \beta$.







If this is so, truncation of the test at 3 times the A.S.N. should again be justifiable, without
the risks exceeding their nominal values. Formulae for the A.S.N. are given later.


Example. Given that an estimate of the variance based on $f = 14$ degrees of freedom is
2.31, it is desired to construct a sequential test with pivotal values for $\mu$ at 5.3 and 7.0, taking
$\alpha = \beta = 0.01$. From the table we have

$$C(0.01, 0.01, 14) = D(0.01, 0.01, 14) = 6.40.$$

The decision constants are thus

$$\pm s^2 C/(\mu_1 - \mu_0) = \pm 8.70.$$

The procedure is then:

Accept the mean to be 7.0 if $\sum_{i=1}^{n} \{x_i - 6.15\} > 8.70$.

Accept the mean to be 5.3 if $\sum_{i=1}^{n} \{x_i - 6.15\} < -8.70$.

Otherwise, continue testing.
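The procedure of the example can be set out as a short routine. This is a sketch only: the function names and the sequence of observations are invented for illustration, while the constants are those of the example.

```python
def decision_constants(s2, c, d, mu0, mu1):
    """Lower and upper limits for the cumulative sum of (x - midpoint)."""
    upper = s2 * c / (mu1 - mu0)
    lower = -s2 * d / (mu1 - mu0)
    return lower, upper


def sequential_test(observations, s2, c, d, mu0, mu1):
    """Apply the test to the observations in order.

    Returns ("accept_mu1", n), ("accept_mu0", n) or ("continue", n).
    """
    lower, upper = decision_constants(s2, c, d, mu0, mu1)
    midpoint = 0.5 * (mu0 + mu1)
    total = 0.0
    for n, x in enumerate(observations, start=1):
        total += x - midpoint
        if total > upper:
            return "accept_mu1", n
        if total < lower:
            return "accept_mu0", n
    return "continue", len(observations)


if __name__ == "__main__":
    # Hypothetical observations, invented purely to exercise the rule.
    sample = [7.5, 8.0, 9.0, 8.5, 7.9, 8.8]
    print(decision_constants(2.31, 6.40, 6.40, 5.3, 7.0))
    print(sequential_test(sample, 2.31, 6.40, 6.40, 5.3, 7.0))
```

Unequal constants $C$ and $D$ are handled in the same way, the lower limit then using $D$ and the upper limit $C$.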


In cases where $\alpha \ne \beta$ it is useful to note that $(1 - \beta)/\alpha$ is not affected considerably by a change
of $\beta$, nor is $\beta/(1 - \alpha)$ affected considerably by a change of $\alpha$. Thus a rough guide can be obtained
by using the tables as follows:

When $\alpha \ne \beta$ obtain $C(\alpha, \alpha, f)$ and $D(\beta, \beta, f)$ from the table. These values will give rough
estimates of $C(\alpha, \beta, f)$ and $D(\alpha, \beta, f)$ respectively.


Example. Given that an estimate of the variance based on $f = 12$ degrees of freedom is
1.5, a test is needed to distinguish between means of 4 and 5, taking $\alpha = 0.01$, $\beta = 0.05$. Then

$$C(0.01, 0.05, 12) \approx C(0.01, 0.01, 12) = 6.78,$$
$$D(0.01, 0.05, 12) \approx D(0.05, 0.05, 12) = 3.68.$$

The test becomes:

Accept the mean to be 5.0 if $\sum_{i=1}^{n} (x_i - 4.5) > 10.17$.

Accept the mean to be 4.0 if $\sum_{i=1}^{n} (x_i - 4.5) < -5.52$.

Otherwise, continue testing.
The accuracy of the approximation in this case can be seen by substituting the $C$ and $D$ used in
the test in equations (26) and (27). Then

$$\alpha = 0.0086, \qquad \beta = 0.0550.$$

This illustrates the fact that the approximation decreases the lower and increases the upper
value.


Operating characteristic curve 


When the variance is known, we have

$$L(\mu) = \frac{A^{h(\mu)} - 1}{A^{h(\mu)} - B^{h(\mu)}},$$

where

$$h(\mu) = \frac{\mu_1 + \mu_0 - 2\mu}{\mu_1 - \mu_0}.$$
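The known-variance operating characteristic is simple to evaluate. The sketch below assumes Wald's usual approximations $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$ for the decision constants, and treats the point $h(\mu) = 0$ by its limiting value; the function name is illustrative.

```python
import math


def oc_known_variance(mu, mu0, mu1, alpha, beta):
    """Probability of accepting H0 when the true mean is mu (sigma known)."""
    a = (1 - beta) / alpha          # assumed form of the constant A
    b = beta / (1 - alpha)          # assumed form of the constant B
    h = (mu1 + mu0 - 2 * mu) / (mu1 - mu0)
    if abs(h) < 1e-12:
        # Limit of (A^h - 1)/(A^h - B^h) as h tends to zero.
        return math.log(a) / (math.log(a) - math.log(b))
    return (a ** h - 1) / (a ** h - b ** h)


if __name__ == "__main__":
    for mu in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(mu, round(oc_known_variance(mu, 0.0, 1.0, 0.01, 0.01), 4))
```

At $\mu = \mu_0$ the function returns $1 - \alpha$ and at $\mu = \mu_1$ it returns $\beta$, which provides a convenient check.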













Let $L_s(\mu, s^2)$ be the probability of accepting $H_0$ when $s^2$ is used as an estimate of the variance.
Then again, taking $t = s^2/\sigma^2$,

$$L_s(\mu, s^2) = \frac{\exp\{h(\mu)Ct\} - 1}{\exp\{h(\mu)Ct\} - \exp\{-h(\mu)Dt\}}. \tag{31}$$

Multiplying $L_s(\mu, s^2)$ by $p(s^2)$ and integrating, we have

$$L_s(\mu) = \frac{(\tfrac12 f)^{\frac12 f}}{\Gamma(\tfrac12 f)} \int_0^\infty \frac{t^{\frac12 f - 1} e^{-\frac12 f t}\,\{\exp\{h(\mu)Ct\} - 1\}}{\exp\{h(\mu)Ct\} - \exp\{-h(\mu)Dt\}}\,dt,$$

where $L_s(\mu)$ is the average approximate value of $L_s(\mu, s^2)$. The integral can be obtained, as
were those for $\alpha$ and $\beta$, giving

$$L_s(\mu) = (\tfrac12 f)^{\frac12 f} \sum_{r=0}^{\infty} \left\{ (\tfrac12 f + r h(\mu)(C + D))^{-\frac12 f} - (\tfrac12 f + h(\mu)(C + r(C + D)))^{-\frac12 f} \right\}. \tag{32}$$

[Figure: operating characteristic curves for $\alpha = \beta = 0.01$ and $\alpha = \beta = 0.05$. Tests: (1) broken curves, $\sigma$ estimated; (2) solid curves, $\sigma$ known. Abscissa: scale of $\mu$, from 0 to 1.0.]

Fig. 1. Operating characteristic curves for sequential tests.
Case: $f = 10$, $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$.


The graphs in Fig. 1 illustrate the operating characteristic curve for two cases when
$f = 10$, supposing that as before $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$. The corresponding O.C. curves are
also drawn for the case when the variance is known. If we are aiming at classifying the
material sampled into two groups, one for which $\mu < \tfrac12(\mu_0 + \mu_1)$ and the other for which
$\mu > \tfrac12(\mu_0 + \mu_1)$, it will be seen that the curves suggest that using the test based on $s^2$ we shall
succeed in making a greater number of correct classifications. This result follows from the
use of wider decision limits, involving the examination of more samples (than for the case
$\sigma^2$ known) before reaching a decision.
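The operating characteristic of the test based on $s^2$ can be evaluated from the series obtained by expanding (31) for $h(\mu) > 0$, whose terms are $(\tfrac12 f + rh(C+D))^{-\frac12 f} - (\tfrac12 f + h(C + r(C+D)))^{-\frac12 f}$. The sketch below makes two assumptions that are not stated in the text: for $h(\mu) < 0$ it uses the identity obtained from (31) by interchanging $C$ and $D$ and changing the sign of $h$, and for $h(\mu) = 0$ it uses the limiting value $C/(C + D)$. The function name is illustrative.

```python
def oc_estimated_variance(h, c, d, f, terms=5000):
    """Average probability of accepting H0 when s^2 replaces sigma^2.

    h is h(mu); c, d are the constants C and D; f the degrees of freedom.
    """
    if h == 0:
        return c / (c + d)                      # assumed limiting value
    if h < 0:
        # Assumed symmetry: L(h; C, D) = 1 - L(-h; D, C), from (31).
        return 1.0 - oc_estimated_variance(-h, d, c, f, terms)
    half_f = 0.5 * f
    total = 0.0
    for r in range(terms):
        total += (half_f / (half_f + r * h * (c + d))) ** half_f
        total -= (half_f / (half_f + h * (c + r * (c + d)))) ** half_f
    return total


if __name__ == "__main__":
    # f = 10 and alpha = beta = 0.01, with C = D = 7.35 from Table 4.
    for h in (1.0, 0.5, 0.0, -0.5, -1.0):
        print(h, round(oc_estimated_variance(h, 7.35, 7.35, 10), 4))
```

At $h(\mu) = 1$, that is $\mu = \mu_0$, the value should be close to $1 - \alpha$, in agreement with (26).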





Average sampling number 


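When the variance is known, we have from (8)

$$\mathcal{E}_\mu(n) = \frac{\sigma^2\{L(\mu)\log_e B + (1 - L(\mu))\log_e A\}}{(\mu_1 - \mu_0)(\mu - \tfrac12(\mu_1 + \mu_0))}.$$

Let $\mathcal{E}_\mu(f, s^2, n)$ be the expected value of the sample size within the conditional set when $s^2$
replaces $\sigma^2$ and write

$$\mathcal{E}_\mu(f, n) = \int_0^\infty \mathcal{E}_\mu(f, s^2, n)\,p(s^2)\,ds^2,$$

$$\frac{\mathcal{E}_\mu(f, s^2, n)}{\mathcal{E}_\mu(n)} = \frac{s^2}{\sigma^2}\left\{\frac{-L_s(\mu, s^2)D + (1 - L_s(\mu, s^2))C}{L(\mu)\log_e B + (1 - L(\mu))\log_e A}\right\}.$$

[Figure: curves for $f$ = 10, 20 and 40. Abscissa: scale of $\mu$, from 0 to 1.0.]

Fig. 2. Showing $\mathcal{E}_\mu(f, n)/\mathcal{E}_\mu(n)$ as a function of $\mu$.
Case: $\alpha = \beta = 0.01$, $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$.

Substituting for $L_s(\mu, s^2)$ the value given by (31), and multiplying by the distribution
function $p(s^2)$, we can integrate as before to give

$$\frac{\mathcal{E}_\mu(f, n)}{\mathcal{E}_\mu(n)} = \frac{(\tfrac12 f)^{\frac12 f + 1}}{(1 - L(\mu))\log_e A + L(\mu)\log_e B} \sum_{r=0}^{\infty} \Big\{ C(\tfrac12 f + h(\mu)(C + r(C + D)))^{-(\frac12 f + 1)} - C(\tfrac12 f + h(\mu)(r + 1)(C + D))^{-(\frac12 f + 1)}$$
$$\qquad - D(\tfrac12 f + r h(\mu)(C + D))^{-(\frac12 f + 1)} + D(\tfrac12 f + h(\mu)[C + r(C + D)])^{-(\frac12 f + 1)} \Big\}. \tag{33}$$

When $\alpha = \beta$, the formula becomes

$$\frac{\mathcal{E}_\mu(f, n)}{\mathcal{E}_\mu(n)} = \frac{(\tfrac12 f)^{\frac12 f + 1} C}{(1 - 2L(\mu))\log_e A} \sum_{r=0}^{\infty} \Big\{ -(\tfrac12 f + 2r h(\mu)C)^{-(\frac12 f + 1)} + 2(\tfrac12 f + h(\mu)(2r + 1)C)^{-(\frac12 f + 1)} - (\tfrac12 f + 2(r + 1)h(\mu)C)^{-(\frac12 f + 1)} \Big\}. \tag{34}$$

The graphs of $\mathcal{E}_\mu(f, n)/\mathcal{E}_\mu(n)$ have been drawn in Fig. 2 for $\alpha = \beta = 0.01$ and $f$ = 10, 20, 40.
The curve shows that the ratio decreases as the true mean $\mu$ shifts further from $\tfrac12(\mu_1 + \mu_0)$.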
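The ratio of average sampling numbers can be computed by summing the four-term series (33), whose terms are $C(\tfrac12 f + h(C + r(C+D)))^{-(\frac12 f+1)} - C(\tfrac12 f + h(r+1)(C+D))^{-(\frac12 f+1)} - D(\tfrac12 f + rh(C+D))^{-(\frac12 f+1)} + D(\tfrac12 f + h[C + r(C+D)])^{-(\frac12 f+1)}$, divided by $(1 - L)\log_e A + L\log_e B$. The sketch below assumes $h(\mu) > 0$, the range in which the series expansion of (31) is used, and Wald's approximations $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ for the known-variance constants; the function name is illustrative.

```python
import math


def asn_ratio(h, c, d, f, alpha, beta, terms=5000):
    """Ratio of the A.S.N. with estimated variance to that with known variance.

    h is h(mu) and must be positive; c, d are the constants C and D.
    """
    if h <= 0:
        raise ValueError("this sketch covers h(mu) > 0 only")
    a = (1 - beta) / alpha                      # assumed form of A
    b = beta / (1 - alpha)                      # assumed form of B
    l_known = (a ** h - 1) / (a ** h - b ** h)  # known-variance O.C.
    denom = (1 - l_known) * math.log(a) + l_known * math.log(b)
    k = 0.5 * f
    power = k + 1
    total = 0.0
    for r in range(terms):
        mid = (k / (k + h * (c + r * (c + d)))) ** power
        total += c * mid
        total -= c * (k / (k + h * (r + 1) * (c + d))) ** power
        total -= d * (k / (k + r * h * (c + d))) ** power
        total += d * mid
    return total / denom


if __name__ == "__main__":
    # mu = mu0 corresponds to h = 1; constants taken from Table 4.
    print(round(asn_ratio(1.0, 7.35, 7.35, 10, 0.01, 0.01), 3))
    print(round(asn_ratio(1.0, 5.15, 5.15, 40, 0.01, 0.01), 3))
```

The ratio should fall towards unity as $f$ increases, since $C$ and $D$ then approach $\log_e A$ and $-\log_e B$.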













Fig. 3 gives the value of $\mathcal{E}_\mu(f, n)/\mathcal{E}_\mu(n)$ for $\mu = \mu_0$ plotted against $f$. This shows that in the
two cases considered, $\alpha = \beta = 0.05$ and $\alpha = \beta = 0.01$, the ratio converges to unity. For
$f = 10$ the ratio is approximately 1.5 in both cases. Thus if it is possible to obtain an estimate
of the variance based on 10 or more degrees of freedom, the average sample size in the
neighbourhood of $\mu = \mu_0$ is only increased by 50%.





[Figure: ratio of average sampling numbers plotted against $f$. Abscissa: scale of $f$, from 5 to 30.]

Fig. 3. Showing $\mathcal{E}_{\mu_0}(f, n)/\mathcal{E}_{\mu_0}(n)$ as a function of $f$, for $\mu = \mu_0$.
Case: $\mu_1 - \mu_0 = 1$, $\sigma = 1$.





5. CONCLUSION 


The sampling experiment to investigate Wald's test to differentiate between two normal
populations by their mean, when the variance is known, shows that in the case examined
with $\mu_1 - \mu_0 = \sigma$ the test has a considerable margin of safety. In view of this and the long
tails of the distributions, when $\alpha = \beta$ it is advisable to truncate at $[3\mathcal{E}_0(n)]$ as suggested in the
paper.

Since there is this safety margin, the test described in the paper in which an estimate of 
the variance replaces the variance should also give satisfactory results. As a further develop- 
ment it seems possible that a sequential ‘t’ test can be evolved in which the estimate of the 
variance is modified as the sampling proceeds. Such a test will be examined in a later paper. 


I would like to thank Dr H. O. Hartley, Dr N. L. Johnson and Prof. E. S. Pearson for their
advice and encouragement on the work of this paper, the opportunity for undertaking which
was made possible by the award of a maintenance grant from the Department of Scientific
and Industrial Research.


REFERENCES 


ARMITAGE, P. (1947). Some sequential tests of 'Student's' hypothesis. J. R. Statist. Soc. Suppl. 9,
250-63.

BARNARD, G. A. (1946). Sequential tests in industrial statistics. J. R. Statist. Soc. Suppl. 8, 1-21.

BARNARD, G. A. (1950). Statistical inference. J. R. Statist. Soc., Series B, 11, 115-43.

DAVIS, H. T. (1935). Tables of the Higher Mathematical Functions, 2. Bloomington: Principia Press.

KAC, M. (1945). Random walk in the presence of absorbing barriers. Ann. Math. Statist. 16, 62-7.

STEIN, C. (1945). A two-sample test for a linear hypothesis. Ann. Math. Statist. 16, 243-58.

WALD, A. (1945). Sequential tests of statistical hypotheses. Ann. Math. Statist. 16, 117-86.

WALD, A. (1947). Sequential Analysis. John Wiley and Sons Inc.; Chapman and Hall Ltd.

WOLD, H. (1948). Tracts for Computers, no. 25. Cambridge University Press.







THE UNBIASED ESTIMATION OF HETEROGENEOUS 
ERROR VARIANCES 


By A. 8. C. EHRENBERG 
Statistical Laboratory, University of Cambridge 


INTRODUCTION 


1. The occurrence of heterogeneous error distributions in statistical data has occasionally 
been discussed, in particular in connexion with the Analysis of Variance technique, where 
the errors are generally assumed to be homogeneous. When heterogeneity of error variance 
is suspected, it is usual to evolve statistical techniques which allow fairly exact tests of 
significance to be made; for instance, certain types of data can be transformed to stabilize 
the error variance before applying the more usual methods of analysis (cf. Bartlett, 1947).

In some work out of which the following considerations arose it was desired in the first 
place to estimate the heterogeneous error variances. Thus when a number of members of 
a food research laboratory, trained as 'tasters', score a series of samples of some food for
a sensory quality factor, such as odour or flavour, on a linear scale, we are interested in
estimating each taster's accuracy. Since the panel members are supposed to give the same
score to any one sample (by the definition of the scoring scale and their training), we can
differentiate between their consistent differences (bias) and the more or less random residual
errors.

We consider $q$ series of parallel observations $x_{ij}$ on $p$ samples in a two-way classification
corresponding to the model

$$x_{ij} = \alpha_i + \beta_j + \epsilon_{ij} \qquad (i = 1, \ldots, p;\ j = 1, \ldots, q),$$

where $\alpha_i$ is the constant common to the observations on the $i$th sample, $\beta_j$ is the bias of the
$j$th series, and $\epsilon_{ij}$ is the (residual) error of the $i$th observation in the $j$th series, distributed
about zero mean with variance $\sigma_j^2$, depending on the series. Where necessary, we shall assume
the error distribution to be normal. We shall deal principally in this paper with the estimation
of the variances $\sigma_j^2$.


2. Cochran (1938, 1947) has briefly discussed the analysis of a numerical example which 
can be represented by this model, but he has assumed (justifiably for the particular example) 
that at least two series (viz. treatments) always have the same error variance, and he gives 
unbiased estimates of these (1947). We shall discuss the more general problem. 

Cochran (1937) has also considered the combination of replicated experiments, each of
which has a different error variance; there are, of course, no particular difficulties in obtaining 
valid estimates of these error variances. 


THE MAXIMUM LIKELIHOOD APPROACH 
3. The likelihood function for the $pq$ values $x_{ij}$, assuming normal error distributions, is

$$\log L = \text{const.} - \sum_{j=1}^{q} p \log \sigma_j - \sum_{i=1}^{p} \sum_{j=1}^{q} \frac{(x_{ij} - \alpha_i - \beta_j)^2}{2\sigma_j^2}.$$












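Differentiating with respect to the parameters $\alpha$, $\beta$, $\sigma$, we obtain for the maximum likelihood
estimates $\hat\alpha$, $\hat\beta$, $\hat\sigma$ the equations

$$\sum_{j=1}^{q} \frac{x_{ij} - \hat\alpha_i - \hat\beta_j}{\hat\sigma_j^2} = 0, \quad \text{for } i = 1, \ldots, p,$$

$$\sum_{i=1}^{p} \frac{x_{ij} - \hat\alpha_i - \hat\beta_j}{\hat\sigma_j^2} = 0, \quad \text{for } j = 1, \ldots, q,$$

$$-\frac{p}{\hat\sigma_j} + \sum_{i=1}^{p} \frac{(x_{ij} - \hat\alpha_i - \hat\beta_j)^2}{\hat\sigma_j^3} = 0, \quad \text{for } j = 1, \ldots, q.$$

These equations cannot in general be solved algebraically, except for the solutions

$$\hat\beta_j - \hat\beta_{\cdot} = x_{\cdot j} - x_{\cdot\cdot}, \quad \text{for } j = 1, \ldots, q,$$

where a period in the suffix denotes the average over the appropriate values, for example,
$x_{\cdot j} = \sum_{i=1}^{p} x_{ij}/p$.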

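4. Complete solutions in two special cases are of interest.
(a) All $\sigma_j^2 = \sigma^2$, i.e. there is no heterogeneity. Here

$$\hat\sigma^2 = \sum_{i=1}^{p} \sum_{j=1}^{q} \frac{(x_{ij} - x_{i\cdot} - x_{\cdot j} + x_{\cdot\cdot})^2}{pq}.$$

This is a biased estimator, as

$$E(\hat\sigma^2) = \frac{(p-1)(q-1)}{pq}\,\sigma^2,$$

and it is consistent only if both $p$ and $q$ tend to infinity.
(b) For two series only* ($q = 2$) we necessarily have

$$\hat\sigma_1^2 = \hat\sigma_2^2 = \sum_{i=1}^{p} \frac{(x_{i1} - x_{\cdot 1} - x_{i2} + x_{\cdot 2})^2}{4p}.$$

Even if $\sigma_1^2 = \sigma_2^2 = \sigma^2$, the estimator is strongly biased and inconsistent (as $p \to \infty$), since

$$E(\hat\sigma_1^2) = \frac{p-1}{2p}\,\sigma^2.$$

When $\sigma_1^2 \ne \sigma_2^2$, the expected value of $\hat\sigma_1^2$ is a function of the irrelevant parameter $\sigma_2^2$, for

$$E(\hat\sigma_1^2) = \frac{p-1}{4p}\,(\sigma_1^2 + \sigma_2^2).$$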

5. Neyman & Scott (1948) discuss the estimation of what they call structural parameters,
which are involved in the frequency distributions of an infinite number of observations, when
there are also an infinity of incidental parameters, which 'occur' only a finite number of
times. For example, in the model of §1, the $\alpha_i$ are incidental, and the $\beta_j$ and $\sigma_j$ are structural
parameters (for $q$ fixed or finite).

Neyman & Scott illustrate by an example similar to that of §4 (a) that maximum likelihood
estimates of structural parameters need not be consistent; in another example (in our
notation all $\alpha$'s and $\beta$'s are equal, the number of observations $p_j$ in the $j$th series is fixed,
and the number of series $q$ can tend to infinity) they show that the maximum likelihood

* This problem appears to be the basis of a remark by Carter (1949); see also Kendall (1950).




estimate of the common mean value $\alpha$ is consistent, but does not have maximum asymptotic
efficiency if the $p_j$ are unequal.

Difficulties in the estimation of structural parameters are occasioned by the presence of
the incidental parameters, the number of which increases with the number of observations.
But it is not clear how appropriate the distinction between structural and incidental parameters
is to their estimation (by maximum likelihood). Thus the equations of §3 give
unbiased, consistent and most efficient estimates of the structural parameters $\beta_j$; and the
substitution of some valid estimates $s_j^2$ of the error variance $\sigma_j^2$ in the equations for the
incidental parameters $\alpha_i$ yields estimates which are consistent, presumably most efficient
(given the estimates of $\sigma_j^2$) and approximately unbiased (if the variance estimates have large
numbers of degrees of freedom so that the sampling biases of the expressions $1/s_j^2$ are
negligible).

Neyman & Scott discuss the derivation of consistent estimates of the structural parameters.
In the next two sections we shall discuss unbiased estimators of the error variances $\sigma_j^2$,
deriving in the first place some special forms by rather ad hoc methods: such estimators as
would in practice be used will also, in fact, be consistent.


SOME UNBIASED ESTIMATORS 
6. Using as (unbiased) estimates of the $\alpha_i$ the unweighted means $x_{i\cdot}$ of the observations
for the $i$th sample (and assuming a linear constraint $\beta_{\cdot} = 0$ on the parameters $\beta_j$), we obtain
as an approximate solution of the maximum likelihood equation for $\sigma_j$ (§3) the expression

$$\sum_{i=1}^{p} \frac{(x_{ij} - x_{i\cdot} - x_{\cdot j} + x_{\cdot\cdot})^2}{p}, \quad \text{for } j = 1, \ldots, q.$$

The bias of this expression depends on parameters irrelevant to the estimation of $\sigma_j^2$, since

$$E \sum_{i=1}^{p} \frac{(x_{ij} - x_{i\cdot} - x_{\cdot j} + x_{\cdot\cdot})^2}{p} = \frac{(p-1)(q-1)^2}{pq^2}\,\sigma_j^2 + \frac{p-1}{pq^2} \sum_{r \ne j} \sigma_r^2.$$

It follows easily that an unbiased estimator $s_j^2$ of $\sigma_j^2$ is (for $q \ge 3$)

$$s_j^2 = \frac{q}{(p-1)(q-2)} \left\{ \sum_{i=1}^{p} (x_{ij} - x_{i\cdot} - x_{\cdot j} + x_{\cdot\cdot})^2 - \frac{1}{q(q-1)} \sum_{i=1}^{p} \sum_{k=1}^{q} (x_{ik} - x_{i\cdot} - x_{\cdot k} + x_{\cdot\cdot})^2 \right\}$$
$$\quad = \frac{q}{(p-1)(q-2)} \left\{ \sum_{i=1}^{p} (\epsilon_{ij} - \epsilon_{i\cdot} - \epsilon_{\cdot j} + \epsilon_{\cdot\cdot})^2 - \frac{1}{q(q-1)} \sum_{i=1}^{p} \sum_{k=1}^{q} (\epsilon_{ik} - \epsilon_{i\cdot} - \epsilon_{\cdot k} + \epsilon_{\cdot\cdot})^2 \right\}.$$

The distribution of $s_j^2$ appears to be complicated and has not been obtained; as is not
unusual with more complex variance estimators, negative sample values are possible. For
large values of $p$ it seems reasonable to assume that the distribution of $s_j^2$ tends to a $\chi^2$ form
and finally to normality, as $s_j^2$ is then formed essentially by the addition of independent
squares. Both as a measure of the efficiency of the estimator and for approximate tests of
significance, it is therefore desirable to obtain the sampling variance of $s_j^2$.
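The estimator $s_j^2$ of this section, $q/\{(p-1)(q-2)\}$ times the residual sum of squares for series $j$ less $1/\{q(q-1)\}$ of the total residual sum of squares, may be computed as follows. This is a minimal sketch: the data matrix (rows for samples, columns for series) is invented for illustration, and the function names are not taken from the paper.

```python
def residuals(x):
    """Two-way residuals x_ij - x_i. - x_.j + x_.. for a p-by-q table."""
    p, q = len(x), len(x[0])
    row = [sum(r) / q for r in x]
    col = [sum(x[i][j] for i in range(p)) / p for j in range(q)]
    grand = sum(row) / p
    return [[x[i][j] - row[i] - col[j] + grand for j in range(q)]
            for i in range(p)]


def unbiased_variances(x):
    """Return the list of estimates s_j^2 (requires q >= 3 and p >= 2)."""
    p, q = len(x), len(x[0])
    if q < 3 or p < 2:
        raise ValueError("need at least three series and two samples")
    res = residuals(x)
    t_j = [sum(res[i][j] ** 2 for i in range(p)) for j in range(q)]
    t_all = sum(t_j)
    factor = q / ((p - 1) * (q - 2))
    return [factor * (t - t_all / (q * (q - 1))) for t in t_j]


# Hypothetical scores: 4 samples scored by 3 tasters.
DATA = [[5.1, 4.8, 5.6],
        [6.0, 6.3, 5.7],
        [4.2, 4.0, 4.9],
        [7.1, 6.6, 7.4]]

if __name__ == "__main__":
    print([round(v, 4) for v in unbiased_variances(DATA)])
```

A useful check is that the average of the $q$ estimates equals the ordinary residual mean square on $(p-1)(q-1)$ degrees of freedom; individual estimates may be negative, as noted in the text.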


Assuming normal distributions for the $\epsilon_{ij}$, we obtain by straightforward calculation that














2 ] qa : \ 
i 2 = 2 Fue y 2 72 
ong oi ayaa dott aap EF 

inochi 4(q—3) alesties 2 wigs aici 

. ~ B= * MDH GBP * HH qa rt 
where 

is 1 a 

o* =— > of, varo} = —— ) (o}-0*)? 













and for estimates $s_j^2$, $s_k^2$, calculated from the same data,


2 
veri) = Gaya t 8) + oe “Hat Dqrae’ a3 + 03) 2% 

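7. When there are only two series of parallel observations ($q = 2$) the method of the last
paragraph breaks down. Here we cannot ignore the information in the data concerning
the relative reliability of the two series, by using as estimates of the $\alpha_i$ the plain means
$x_{i\cdot} = \tfrac12(x_{i1} + x_{i2})$.

The expected values of the sums of squares of the observations $x_{ij}$ in each of the two series
about their respective means are

$$E \sum_{i=1}^{p} (x_{ij} - x_{\cdot j})^2 = (p-1)\sigma_j^2 + (p-1)\operatorname{var}\alpha_i, \quad \text{for } j = 1, 2,$$

where $\operatorname{var}\alpha_i$ stands for the algebraic expression $\sum_{i=1}^{p} (\alpha_i - \alpha_{\cdot})^2/(p-1)$. We have also for the mean
values of the two series*

$$E \sum_{i=1}^{p} (x_{i\cdot} - x_{\cdot\cdot})^2 = \frac{p-1}{4}(\sigma_1^2 + \sigma_2^2) + (p-1)\operatorname{var}\alpha_i,$$

and therefore an unbiased estimator $s_1'^2$ of $\sigma_1^2$, say, is

$$s_1'^2 = \frac{1}{p-1} \sum_{i=1}^{p} (x_{i1} - x_{\cdot 1})(x_{i1} - x_{\cdot 1} - x_{i2} + x_{\cdot 2}).$$

When we have $q$ series of observations ($q \ge 2$) we can write

$$s_j'^2 = \frac{1}{p-1} \left\{ \sum_{i=1}^{p} (x_{ij} - x_{\cdot j})^2 - \frac{2}{q(q-1)} \sum_{r<s}^{q} \sum_{i=1}^{p} (x_{ir} - x_{\cdot r})(x_{is} - x_{\cdot s}) \right\}, \quad \text{for } j = 1, \ldots, q.$$

This estimator seems to be the easier to calculate numerically, as in the form

$$s_j'^2 = \frac{1}{p-1} \left\{ \sum_{i=1}^{p} (x_{ij} - x_{\cdot j})^2 + \frac{1}{q(q-1)} \sum_{i=1}^{p} \sum_{r=1}^{q} (x_{ir} - x_{\cdot r})^2 - \frac{q}{q-1} \sum_{i=1}^{p} (x_{i\cdot} - x_{\cdot\cdot})^2 \right\}$$

it involves no cross-products.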
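The estimator $s_j'^2$ can be computed either from the cross-products of the column deviations $x_{ir} - x_{\cdot r}$, weighted by $-2/\{q(q-1)\}$, or from sums of squares alone, using the identity $2\sum_{r<s} a_r a_s = (\sum_r a_r)^2 - \sum_r a_r^2$. The sketch below computes both forms so that they can be compared; the function names and the data are illustrative only.

```python
def centred(x):
    """Deviations a_ij = x_ij - x_.j of each observation from its series mean."""
    p, q = len(x), len(x[0])
    col = [sum(x[i][j] for i in range(p)) / p for j in range(q)]
    return [[x[i][j] - col[j] for j in range(q)] for i in range(p)]


def s_prime_cross(x, j):
    """Form with cross-products between all pairs of series."""
    a = centred(x)
    p, q = len(a), len(a[0])
    own = sum(a[i][j] ** 2 for i in range(p))
    cross = sum(a[i][r] * a[i][s]
                for i in range(p) for r in range(q) for s in range(r + 1, q))
    return (own - 2.0 * cross / (q * (q - 1))) / (p - 1)


def s_prime_squares(x, j):
    """Equivalent form using sums of squares only."""
    a = centred(x)
    p, q = len(a), len(a[0])
    own = sum(a[i][j] ** 2 for i in range(p))
    all_squares = sum(v * v for row in a for v in row)
    # The mean of a_ir over r equals x_i. - x_.. for each sample i.
    row_dev = sum((sum(row) / q) ** 2 for row in a)
    return (own - q * row_dev / (q - 1) + all_squares / (q * (q - 1))) / (p - 1)


# Hypothetical data: 4 samples, 3 series.
DATA = [[5.1, 4.8, 5.6],
        [6.0, 6.3, 5.7],
        [4.2, 4.0, 4.9],
        [7.1, 6.6, 7.4]]

if __name__ == "__main__":
    for j in range(3):
        print(j, round(s_prime_cross(DATA, j), 4), round(s_prime_squares(DATA, j), 4))
```

For $q = 2$ the first form reduces to the product expression given for $s_1'^2$.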
The sampling distribution of $s_j'^2$ depends, however, not only on all the parameters
$\sigma_t^2$ ($t = 1, \ldots, q$), but also on the sample constants $\alpha_i$ ($i = 1, \ldots, p$). For instance, assuming
normal error distributions,


, 204, (p—2) vara, 2 
var sj" ~@-—)) ee {(q- a}+ ota eleatht sn oe 
and var (sj? — 3,2) = at ied fl ge ang SER 


(p—1)  p(p—1) 
where $s_j'^2$ and $s_k'^2$ are calculated from the same data. It may be noted that the last expression
is independent of the parameters $\sigma_t^2$ ($t \ne j, k$). The difference $\sigma_j^2 - \sigma_k^2$ can, of course, be
estimated by the expression

$$\frac{1}{p-1} \left\{ \sum_{i=1}^{p} (x_{ij} - x_{\cdot j})^2 - \sum_{i=1}^{p} (x_{ik} - x_{\cdot k})^2 \right\},$$

which is equivalent to $s_j'^2 - s_k'^2$.
It should perhaps be emphasized that in practice the values of the parameters $\alpha_i$ are
frequently quite arbitrary in the sense that they cannot be considered to be a random sample

* I am indebted to Mr F. J. Anscombe for suggesting this step.




from some distribution whose variance is estimated by $\operatorname{var}\alpha_i$. The efficiency of the estimators
$s_j'^2$ may therefore vary from one set of observations to another, even if the $\sigma_j^2$ themselves
remain constant.

8. We have deduced two different unbiased estimators of the $\sigma_j^2$ which are quadratic
functions of the observations. Before considering quadratic estimators more generally, we 
shall discuss a common indirect method of estimating variances, using the sample range. 
Estimation based on the range is useful and numerically economical when the data are 
numerous. If the data for each series are in the form of n random samples of m observations 
each, we calculate the mean range (over the n samples) of m random observations; the pro- 
cedure is most efficient when m is small. 

In our problem we can, for example, consider the range of the quantities $(x_{ij} - x_{i\cdot})$ over
a sample of $m$,

$$r_j = \text{Max.}\,(x_{ij} - x_{i\cdot}) - \text{Min.}\,(x_{ij} - x_{i\cdot}) \qquad (i = 1, \ldots, m).$$

Now $(x_{ij} - x_{i\cdot})$ is distributed (assuming normality) as $N(\beta_j - \beta_{\cdot}, \Sigma_j^2)$, where

$$\Sigma_j^2 = \frac{1}{q}\{(q-2)\sigma_j^2 + \bar\sigma^2\},$$

so that for the range $r_j$ we have $r_j = k_m S_j$,

where $k_m$ is a constant conversion factor (Pearson, 1931, Table XXII), depending on the
number $m$ of observations in the sample, and $S_j$ is an unbiased estimator of $\Sigma_j$.
If $S_j^2$ is calculated from the mean range $\bar r_j$ of $n$ samples ($n \ge 1$), then

$$E(S_j^2) = \left(1 + \frac{v}{n}\right)\Sigma_j^2,$$

where $v\Sigma_j^2$ is the sampling variance of $S_j$. (This can be estimated from the standard error
of the range, which is given for selected sample sizes $m$ in Pearson, 1931, p. cxvii.) In practice,
for large $n$, $v/n$ will be small.
We have therefore that an unbiased estimator of $\sigma_j^2$ is (for $q > 2$)

$$\frac{q}{(q-2)(1 + v/n)} \left\{ S_j^2 - \frac{1}{q(q-1)} \sum_{k=1}^{q} S_k^2 \right\}.$$

The sampling variance of this expression has not been obtained; as the statistics $r_j$ are not
independent it is likely to be complicated algebraically, but it can be presumed that this 
estimator is less efficient than the related estimator of §6, as is usual in variance estimation 
by range and by second sampling moments respectively. 


THE GENERAL QUADRATIC FORM 


9. To estimate the variance $\sigma_j^2$ we can consider the general quadratic function of the $pq$
observations $x_{ij}$,


p 
Q= z A, xi, + >» jt = Vip big + > C, = MT met 50 > - Tip Lins 


r<s r<s om 


where the coefficients A and C are constants. (The coefficients of the elements of $\sum_{i=1}^{p}x_{ir}^2$, etc.,
must clearly all be the same if they are to be independent of the parameters $\alpha_i$.) The condition
that the expected value of Q (now called $Q_j$) should equal $\sigma_j^2$ gives


Q; = 7 (45 — x ;)* Si nie A Ay, b (Xj, Ly) (Lig— Xs), 


aa aol 










where $\sum_{r<s}^{q}A_{rs} = -\dfrac{1}{p}$.
If we put all $A_{rs}$ equal to $-2/pq(q-1)$, we obtain $Q_j = s_j'^2$ (§7). For two series only there is
only one coefficient, $A_{12}$, so that $s_j'^2$ is the only possible quadratic estimator of $\sigma_j^2$ when q = 2.
On the other hand, the form $s_j^2$ of §6 follows from putting

$$A_{jr} = \frac{-2}{p(q-1)} \quad\text{and}\quad A_{rs} = \frac{2}{p(q-1)(q-2)} \quad (r, s \neq j).$$

If the coefficients $A_{rs}$ are to be independent of the variances $\sigma_j^2$, all $A_{jr}$ and all $A_{rs}$ $(r, s \neq j)$
must be equal, to $A_1$ and A say, and we can write the constraint on the coefficients in $Q_j$ as

$$(q-1)A_1+\frac{(q-1)(q-2)}{2}A = -\frac{1}{p}.$$
Even now there are an infinite number of possible forms of $Q_j$, but only one further one
appears to be worth mentioning. If we put $A_1 = 0$, we obtain another generalization for more
than two series of the expression $s_j'^2$ for q = 2 (see §7), namely,


“ees Today - hag 
= G1) q@-y ©, 4) (Gy — %.—% y+, ). 


This may clearly be regarded as intermediate to $s_j^2$ and $s_j'^2$; it is not as simple to calculate as
the latter, and its distribution still depends on the parameters $\alpha_i$, as may be seen from its
sampling variance, which is given by substituting the appropriate coefficients $A_{rs}$ in the
general expression below.


10. The sampling variance of $Q_j$, which again follows after straightforward if lengthy
algebra, is





204 (p—2) ( q ya g(a \2 p 2 
varQ, = + vara, (| 2+ Ay) 2+ 4,,) oi\4 2 A?, oc? 62, 
= yt p= |(2+PZ Aa) 7 HP EE An) + yy) E ABor ot 
q 1 
where >y A, =—--- 
r<s Pp 


The variance of any unbiased quadratic estimator $Q_j$ of $\sigma_j^2$ depends therefore in general on
other parameters, namely, all the error variances and the parameters $\alpha_i$ (i = 1, ..., p), and it
cannot be minimized for all possible parameter values by any one choice of coefficients $A_{rs}$.
We can choose these coefficients so that the variance of the estimator is independent of the
parameters $\alpha_i$; this again gives us the estimator $s_j^2$ of §6. If we minimize the last term

$$\sum_{r<s}A_{rs}^2\,\sigma_r^2\sigma_s^2$$

(for any given set of variances), we have all $A_{rs}$ equal and obtain $s_j'^2$ of §7.

If for any given set of data we wish to estimate the error variances $\sigma_j^2$ as accurately as
possible, we can first estimate them by one of the special forms of $Q_j$ which have been
mentioned. We can then substitute these estimates in the expression for var $Q_j$ and
minimize this with respect to the coefficients $A_{rs}$, which will now not necessarily all be
equal to $A_1$ or A (cf. §9). The solutions of repeated application of this process should
converge, but I doubt whether the labour involved would in the majority of cases be worth
while.

In practice all $A_{rs}$ are likely to be chosen smaller than 1/p, to give efficient estimators; in
that case $Q_j$ will be consistent (as p → ∞), since its sampling variance tends to zero.







11. Although no quadratic estimators with minimum variance for all values of the
parameters exist, it is of some interest to consider the efficiency of the two estimators $s_j^2$ and
$s_j'^2$, which seem to be the most suitable.

Except when var $\alpha_i$ is small, $s_j^2$ will be more efficient if q is at all large; in fact for large q





var 8? >———o4, 
>: Ie 
" oF {, .. (p—2) 
whereas ee > + 208+ ; var a;}. 


The expression $2\sigma_j^4/(p-1)$ can be taken to be a lower bound of the sampling variance of any
estimator of $\sigma_j^2$, since it is the variance of the maximum likelihood estimator when we know
that all the $\alpha_i$ are equal. If the $\alpha_i$ have to be estimated or allowed for, the accuracy of the
estimation of the $\sigma_j^2$ must be reduced.

The relative importance of the terms additional to $2\sigma_j^4/(p-1)$ in the variance formula
clearly depends on which of the $\sigma_j^2$ we are estimating. To keep the discussion general we shall
therefore consider the average of the $\sigma_j^2$, $\sigma^2$. If we expect that all $\alpha_i$ are more or less equal, so
that var $\alpha_i$ is small (compared with $\sigma^2$), $s_j'^2$ would seem to be the best estimator to use. It may
often be possible to divide the data into batches within each of which var $\alpha_i$ is small, and we
can then calculate $s_j'^2$ separately for each batch and average. However, if these batches are
small, the loss of efficiency (given approximately by (1 - 1/m) if the batches are all of size m)
due to such a procedure may be larger than that involved in using $s_j^2$, whose distribution is
independent of the $\alpha_i$, for all the data.

When q = 2 and var $\alpha_i$ is not negligible, the efficiency, relative to $2\sigma^4/(p-1)$, of $s_j'^2$ will
only be less than about 50 % if var $\alpha_i$ is large compared to $\sigma^4$, and the efficiency of $s_j^2$ when
q = 3 and q = 4 will on the average be about 30 and 50 %. As q increases the efficiency will
be rather higher, but it should be noted that $2\sigma^4/(p-1)$ may in any case not be an attainable
lower bound to the variance in these special cases. It may not be necessary therefore to
take the search for efficient estimators of $\sigma_j^2$ any further.


THE ANALYSIS OF VARIANCE 


12. It is of interest to consider the use of the analysis of variance technique on this type 
of data, where in a two-way (or higher) classification (series and samples) the error variances 
are not equal in one of the classifications (i.e. the series). 

There are two questions that can be asked: 

(a) Is the use of the usual analysis of variance technique, based on homogeneous (normal) 
errors, valid? 

(b) Does the technique still give the sort of answers that we desire, and does it do so
efficiently?


13. On the usual null hypothesis (i.e. all $\alpha_i$ and $\beta_j$ = 0) all the mean squares (i.e. between
series, between samples and residual) are unbiased estimates of the average error variance
of the observations

$$\sigma^2 = \frac{1}{q}(\sigma_1^2+\dots+\sigma_q^2).$$

But the 'between series' and the 'residual' mean squares do not follow the usual χ²-
distributions. On the other hand, the 'between samples' mean squares and, in a higher




classification, any mean square orthogonal to the error-heterogeneity, are distributed exactly
as $\sigma^2\chi^2/(p-1)$ with the usual number (p - 1) of degrees of freedom, so that in a higher
classification exact F- and t-tests can be based on these various main effects and interactions.

The sampling variance of the residual sum of squares in the analysis of variance, which
can be written

$$(p-1)\frac{(q-1)}{q}\sum_{j=1}^{q}s_j^2 \quad (s_j^2 \text{ is the §6 estimator}),$$

is

$$2(p-1)(q-1)(\sigma^2)^2\Big\{1+\frac{(q-2)}{q}\,\frac{\operatorname{var}\sigma_j^2}{\sigma^4}\Big\},$$

where

$$\frac{\operatorname{var}\sigma_j^2}{\sigma^4} = \frac{1}{(q-1)}\sum_{j=1}^{q}\Big(\frac{\sigma_j^2}{\sigma^2}-1\Big)^2;$$

both $\sigma^2$ and var $\sigma_j^2/\sigma^4$ are merely algebraic expressions of the finite number of parameters $\sigma_j^2$.

At least, for fairly large numbers of degrees of freedom (where most sampling values lie
close to the mean), one will probably be content to approximate to the distribution of the
residual sum of squares by a χ²-form of distribution with the same mean and variance
(cf. Satterthwaite, 1941, 1946). It follows that the distribution of the residual mean square
can be approximately represented by a $\sigma^2\chi^2/\nu$ distribution with

$$\nu = \frac{(p-1)(q-1)}{1+\dfrac{(q-2)}{q}\,\dfrac{\operatorname{var}\sigma_j^2}{\sigma^4}}$$

degrees of freedom, where var $\sigma_j^2/\sigma^4$ has to be estimated.
The 'between series' mean square is affected similarly; its expectation on the null
hypothesis is the average error variance $\sigma^2$, and its sampling variance is

$$\frac{2}{(q-1)}(\sigma^2)^2\Big\{1+\frac{(q-2)}{q}\,\frac{\operatorname{var}\sigma_j^2}{\sigma^4}\Big\}.$$

The appropriate number of degrees of freedom of an approximating $\sigma^2\chi^2/\nu'$ distribution is
therefore $\nu' = \nu/(p-1)$.

A variance-ratio test for all the series constants or biases $\beta_j$ (j = 1, ..., q) can be based on
these degrees of freedom. But tests of significance for individual constants $\beta_j$ based on the
estimate of the average error variance $\sigma^2$ may, as Cochran (1947) has pointed out, be
misleading. Here estimates of the individual variances $\sigma_j^2$ for each $\beta_j$ should be used; because
of the complexity of any χ²-approximation to the distributions of variance estimators such
as those of §§6 and 7, and the considerable errors that would be involved in the estimation
of the relevant degrees of freedom, only large sample tests based on standard errors appear
likely to be possible.

14. The departure from the usual χ²-distribution, which is measured by the correcting
factor $\big(1+\frac{(q-2)}{q}\frac{\operatorname{var}\sigma_j^2}{\sigma^4}\big)$, may in practice be numerically small. It seems likely that in many
cases the ratio of the largest to the smallest error variance will at most be of the order of five
or ten. For example, in taste-panel experiments (see §1) on wet white fish (Shewan et al.
1950), the following estimated error variances for seven tasters (series) were obtained:

0·30, 0·57, 0·81, 1·05, 1·11, 1·15, 1·44,




for which $\frac{(q-2)}{q}\frac{\operatorname{var}\sigma_j^2}{\sigma^4}$ = 0·127, using the estimates of the $\sigma_j^2$. Again, in some broccoli
assessments and in some egg-tasting tests (Ehrenberg, 1950a, b), the error variances, for four
panel members in each case, were estimated to be

0·22, 0·43, 0·54, 0·69, and 0·15, 0·22, 0·32, 0·36.

The correcting terms here are 0·009 and 0·001. The actual effect on the χ²-distribution is
better measured by the standard deviation than by the variance, and for small values of
$\frac{(q-2)}{q}\frac{\operatorname{var}\sigma_j^2}{\sigma^4}$ the increase in the standard deviation is only about half this correcting term.
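
The correcting term and the approximating degrees of freedom of §§13 and 14 are easily computed. The following is a minimal sketch, assuming that the correcting term is $(q-2)/q$ times var $\sigma_j^2/\sigma^4$, with var $\sigma_j^2$ calculated with divisor q - 1; the function names are hypothetical and the number of samples p passed to `approx_df` is whatever the experiment provides.

```python
def correcting_term(variances):
    """(q - 2)/q * var(sigma_j^2) / (average sigma^2)^2, divisor q - 1 in var."""
    q = len(variances)
    mean = sum(variances) / q
    var = sum((s - mean) ** 2 for s in variances) / (q - 1)
    return (q - 2) / q * var / mean ** 2


def approx_df(p, variances):
    """Approximate degrees of freedom of the residual and the
    'between series' mean squares (a chi-square fit by mean and variance)."""
    q = len(variances)
    factor = 1.0 + correcting_term(variances)
    nu_residual = (p - 1) * (q - 1) / factor
    nu_series = (q - 1) / factor
    return nu_residual, nu_series


# Estimated error variances of the seven tasters quoted in the text.
tasters = [0.30, 0.57, 0.81, 1.05, 1.11, 1.15, 1.44]
term = correcting_term(tasters)            # about 0.126
sd_increase = (1.0 + term) ** 0.5 - 1.0    # roughly half of `term`
```

On this reading the seven taster variances give a correcting term of about 0·126, close to the value quoted in the text. With equal variances, or with q = 2, the term vanishes and the usual degrees of freedom are recovered.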

Since for fairly large numbers of degrees of freedom relatively small changes in them do
not affect a χ²-distribution strongly, F- and t-tests of significance (with the exception of
t-tests for the differences between series constants $\beta_j$) in the analysis of variance of data such
as have been considered here should be valid.

For a given amount of variance heterogeneity (var $\sigma_j^2/\sigma^4$) the departures from the usual
χ²-distributions will be least for small numbers of series q; in fact, when q = 2, all the various
sums of squares in the analysis of variance will be distributed exactly as $\sigma^2\chi^2$ with the usual
number of degrees of freedom, where $\sigma^2 = \tfrac12(\sigma_1^2+\sigma_2^2)$.


15. The analysis of variance technique should, as indicated in the last paragraphs, still
give tests for the effects of the samples or the series (the $\alpha_i$ and $\beta_j$ constants), and unbiased
estimates of the residual error (here a pooled error). However, unless we require results which
are valid under all the (factorial) conditions of the experiments, we may obtain more sensitive
tests by the omission of some of the more erratic data. But in practice tests against the pooled
$\sigma^2$ will be most powerful unless the error variances of the series differ very widely indeed.
For instance, if there are two series with error variances $\sigma_1^2$ and $\sigma_2^2$ respectively ($\sigma_1^2 < \sigma_2^2$ say),
then only if $\sigma_2^2 > 3\sigma_1^2$ will a test of significance for $\alpha_l-\alpha_m$ be more powerful if based on
$x_{l1}-x_{m1}$ and $\sigma_1^2$, than a test of the means $x_{l.}-x_{m.}$ against $\tfrac12(\sigma_1^2+\sigma_2^2)$. In all the examples
quoted in the last paragraph, tests for the differences between samples based on the panel
means will be more powerful than if the more erratic tasters had been ignored.


ERROR HETEROGENEITY IN MORE THAN ONE CLASSIFICATION 
16. Instead of the population model of §1 we may have to consider a model

$$y_{ij} = \alpha_i+\beta_j+\delta_{ij}+\epsilon_{ij} \quad (i = 1,\dots,p;\ j = 1,\dots,q),$$

where $E(\delta_{ij}) = E(\epsilon_{ij}) = 0$, $E(\delta_{ij}^2) = \tau_i^2$, $E(\epsilon_{ij}^2) = \sigma_j^2$.
In this case it is impossible to obtain quadratic unbiased estimators of $\sigma_j^2$ or of $\tau_i^2$, but if all
$\tau_i^2$ are equal, the problem reduces essentially to the one discussed earlier. Any of the unbiased
estimators mentioned earlier in this paper will estimate $\sigma_j^2+\tau^2$ (or $\tau_i^2+\sigma^2$) if applied to data
of this form. We can therefore also estimate

$$\sigma_r^2-\sigma_s^2 \ (r \neq s) \quad\text{or}\quad \tau_l^2-\tau_m^2 \ (l \neq m).$$

In an analysis of variance of a set of such observations the residual mean square gives an
unbiased estimate of the average error variance

$$\sigma^2+\tau^2 = \frac{1}{q}\sum_{j=1}^{q}\sigma_j^2+\frac{1}{p}\sum_{i=1}^{p}\tau_i^2.$$




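The variance of this estimate is

$$\frac{2}{(p-1)(q-1)}\Big\{(\sigma^2+\tau^2)^2+\frac{(q-2)}{q}\operatorname{var}\sigma_j^2+\frac{(p-2)}{p}\operatorname{var}\tau_i^2\Big\},$$

where

$$\operatorname{var}\sigma_j^2 = \sum_{j=1}^{q}\frac{(\sigma_j^2-\sigma^2)^2}{(q-1)}, \text{ etc.}$$

The residual mean square can therefore be regarded as being distributed approximately as
$(\sigma^2+\tau^2)\chi^2/\nu''$ with

$$\nu'' = \frac{(p-1)(q-1)(\sigma^2+\tau^2)^2}{(\sigma^2+\tau^2)^2+\dfrac{(q-2)}{q}\operatorname{var}\sigma_j^2+\dfrac{(p-2)}{p}\operatorname{var}\tau_i^2}$$

degrees of freedom.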

Similar approximations can be made for the distributions of the 'between series' and
'between samples' mean squares. For the comparison of any two constants $\alpha_l$, $\alpha_m$, or $\beta_r$, $\beta_s$,
appropriate variances will have to be estimated, as the residual error estimate may again
give distorted results.

Generally, we shall not have to assume a model of two superimposed error distributions
unless the $\sigma_j^2$ and $\tau_i^2$ are of the same order. Numerically, the corrections to the degrees of
freedom (which can, by the way, be estimated, being of the form $\sigma_r^2-\sigma_s^2$) may again be
relatively small; considering, for example, values both of the $\sigma_j^2$ and the $\tau_i^2$ of 0·25, 0·5, 1, 1·5,
1·75 (the ratio of the largest to the smallest $\sigma_j^2$ or $\tau_i^2$ is 7) the approximate number of degrees
of freedom for the residual error is about 14¼ instead of 16.
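
The numerical example of this section can be reproduced with a short calculation. The sketch below assumes the approximating degrees of freedom are obtained by matching the mean and variance of the residual mean square as described above; the function names are hypothetical.

```python
def sample_var(values):
    """Variance of a set of parameters with divisor (number of values - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def residual_df_two_way(sigma2, tau2):
    """Approximate residual degrees of freedom when the error variances
    differ in both classifications (sigma_j^2 by series, tau_i^2 by samples)."""
    q, p = len(sigma2), len(tau2)
    total = sum(sigma2) / q + sum(tau2) / p          # sigma^2 + tau^2
    denominator = (total ** 2
                   + (q - 2) / q * sample_var(sigma2)
                   + (p - 2) / p * sample_var(tau2))
    return (p - 1) * (q - 1) * total ** 2 / denominator


values = [0.25, 0.5, 1.0, 1.5, 1.75]
nu = residual_df_two_way(values, values)   # about 14.26, against 16 nominal
```

With equal variances in both classifications the function returns the nominal (p - 1)(q - 1).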

The argument of this section can be generalized to more than two classifications. 


SUMMARY 


17. We have considered data in a two-way or higher classification without replications, 
where the error variances may differ between the classes of one set. It appears that the 
method of maximum likelihood breaks down in estimating these error variances. Two 
unbiased quadratic estimators and one based on range are derived; the family of unbiased 
quadratic form estimators is also discussed, and it is shown that in the case of two classes 
there is only one unbiased estimator. The sampling variances of the estimators depend on 
some or all of the parameters involved, and no generally ‘best’ estimator in the sense of 
minimum variance exists, but in practice it should be possible to apply with reasonably high 
efficiency one of the given estimators. 

The analysis of variance of data of this kind is discussed, and where the distributions of
the mean squares depart from the usual χ²-forms, degrees of freedom of approximating
χ²-distributions are suggested. As the corrections will often be small, the usual F- and t-tests
should be valid, particularly when the numbers of degrees of freedom involved are large, 
except that tests for the differences between any two class-means in the ‘heterogeneous’ 
classification should not be based on the pooled residual error estimate. 

The arguments are briefly generalized for heterogeneity in more than one dimension. 


I wish to thank Mr F. J. Anscombe for his guidance, Dr H. E. Daniels for some critical 
advice and Dr John Wishart for his careful reading of the draft. The paper has arisen out of 
work carried out on behalf of the Department of Scientific and Industrial Research (Food 
Investigation Organisation). 




REFERENCES 


Bartlett, M. S. (1947). The use of transformations. Biometrics, 3, 39.
Carter, A. H. (1949). The estimation and comparison of residual regressions where there are two or
more related sets of observations. Biometrika, 36, 26 (41).
Cochran, W. G. (1937). Problems arising in the analysis of a series of similar experiments. J. R. Statist.
Soc. Suppl. 4, 102.
Cochran, W. G. (1938). Some difficulties in the statistical analysis of replicated experiments. Emp.
J. exp. Agric. 6, 157.
Cochran, W. G. (1947). Some consequences when the assumptions for the analysis of variance are not
satisfied. Biometrics, 3, 22.
Ehrenberg, A. S. C. (1950). (a) A test of broccoli judges (unpublished). (b) Oiled egg experiments
(1948). Analysis of data (unpublished).
Kendall, M. G. (1950). Factor analysis. J. R. Statist. Soc. B (§36). (In the Press.)
Neyman, J. & Scott, Elizabeth L. (1948). Consistent estimates based on partially consistent observations.
Econometrica, 16, 1.
Pearson, K. (1931). Tables for Statisticians and Biometricians, Part II. Cambridge University Press.
Satterthwaite, F. E. (1941). Synthesis of variance. Psychometrika, 6, 309.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components.
Biometrics, 2, 110.
Shewan, J. M., Macintosh, Ruth G., Tucker, C. G. & Ehrenberg, A. S. C. (1950). The subjective
assessment of spoilage of wet white fish and the development of a numerical scoring system.
(In preparation.)













SAMPLING THEORY OF THE NEGATIVE BINOMIAL AND 
LOGARITHMIC SERIES DISTRIBUTIONS 


By F. J. ANSCOMBE, Statistical Laboratory, University of Cambridge 


1. INTRODUCTION 


The negative binomial distribution depends on two parameters, which for many purposes
may be conveniently taken as the mean m and the exponent k. The chance of observing any
non-negative integer r is

$$P_r = \Big(1+\frac{m}{k}\Big)^{-k}\,\frac{\Gamma(k+r)}{r!\,\Gamma(k)}\,\Big(\frac{m}{m+k}\Big)^{r}. \tag{1.1}$$

Sometimes it is more convenient to replace m by p or X defined by

$$p = \frac{m}{k}, \qquad X = \frac{p}{1+p} = \frac{m}{m+k}. \tag{1.2}$$

Thus we may write

$$P_r = (1-X)^k\,\frac{\Gamma(k+r)}{r!\,\Gamma(k)}\,X^r. \tag{1.3}$$

We assume k, m, p > 0, 0 < X < 1. The factorial-cumulant-generating function is

$$\ln E\{(1+t)^r\} = \sum_i \kappa_{[i]}\,t^i/i! = -k\ln(1-pt), \tag{1.4}$$

and the ith factorial cumulant is

$$\kappa_{[i]} = (i-1)!\,kp^i. \tag{1.5}$$

The generating function of ordinary cumulants* is

$$\ln E(e^{rt}) = \sum_i \kappa_i\,t^i/i! = -k\ln\{1-p(e^t-1)\}, \tag{1.6}$$

and the first four are

$$\begin{aligned}
\kappa_1 &= kp = m,\\
\kappa_2 &= kp(1+p) = m+m^2/k,\\
\kappa_3 &= kp(1+p)(1+2p),\\
\kappa_4 &= kp(1+p)(1+6p+6p^2).
\end{aligned} \tag{1.7}$$

From (1.4) or (1.6) we see that the sum of N independent observations from the distribution
has still a distribution of negative binomial form, with mean Nm and exponent Nk.
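
The frequency function and the additivity property just stated can be checked numerically. The following is a minimal sketch; the function names are hypothetical, and the probabilities are computed on the logarithmic scale purely for numerical convenience.

```python
import math


def neg_binomial_pmf(r, m, k):
    """Chance of observing r for a negative binomial with mean m, exponent k."""
    log_p = (math.lgamma(k + r) - math.lgamma(r + 1) - math.lgamma(k)
             - k * math.log1p(m / k) + r * math.log(m / (m + k)))
    return math.exp(log_p)


def moments(pmf, rmax):
    """Total probability, mean and variance over r = 0, ..., rmax."""
    probs = [pmf(r) for r in range(rmax + 1)]
    total = sum(probs)
    mean = sum(r * p for r, p in enumerate(probs))
    var = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
    return total, mean, var


def convolve(a, b):
    """Distribution of the sum of two independent counts (same truncation)."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[r - i] for i in range(r + 1)) for r in range(n)]


m, k = 3.0, 1.5
total, mean, var = moments(lambda r: neg_binomial_pmf(r, m, k), 400)
single = [neg_binomial_pmf(r, m, k) for r in range(51)]
pair = convolve(single, single)   # compare with mean 2m, exponent 2k
```

The mean and variance agree with m and m + m²/k, and the distribution of the sum of two observations agrees with the negative binomial of mean 2m and exponent 2k.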

The logarithmic series distribution of R. A. Fisher is obtained by a limiting process from 
the negative binomial distribution by considering a sample of N readings, letting N tend 
to infinity and k to zero, and neglecting the zero readings. It is a multivariate distribution, 
consisting of a set of independent Poisson distributions with mean values 

$$\alpha X,\quad \tfrac12\alpha X^2,\quad \tfrac13\alpha X^3,\quad \dots \tag{1.8}$$
A ‘sample’ comprises one reading from each Poisson distribution. 

* For a discussion of ordinary and factorial cumulants of a related distribution see Wishart (1947).
Aitken (1939) and Haldane (1949) have pointed out that discrete distributions are often more
conveniently described by factorial than ordinary cumulants, and this proves to be so for the
distributions considered in §2. The following relations may be noted:

$$\begin{aligned}
\kappa_1 &= \kappa_{[1]},\\
\kappa_2 &= \kappa_{[2]}+\kappa_{[1]},\\
\kappa_3 &= \kappa_{[3]}+3\kappa_{[2]}+\kappa_{[1]},\\
\kappa_4 &= \kappa_{[4]}+6\kappa_{[3]}+7\kappa_{[2]}+\kappa_{[1]}.
\end{aligned}$$
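
The relations between ordinary and factorial cumulants noted above can be verified for the negative binomial in exact arithmetic. This is only a sketch; the function names and the parameter values are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import factorial


def nb_factorial_cumulants(k, p):
    """First four factorial cumulants (i - 1)! k p^i of the negative binomial."""
    return [factorial(i - 1) * k * p ** i for i in range(1, 5)]


def ordinary_from_factorial(f):
    """Convert the first four factorial cumulants to ordinary cumulants."""
    f1, f2, f3, f4 = f
    return [f1,
            f2 + f1,
            f3 + 3 * f2 + f1,
            f4 + 6 * f3 + 7 * f2 + f1]


def nb_ordinary_cumulants(k, p):
    """Closed forms of the first four ordinary cumulants."""
    return [k * p,
            k * p * (1 + p),
            k * p * (1 + p) * (1 + 2 * p),
            k * p * (1 + p) * (1 + 6 * p + 6 * p ** 2)]


k, p = Fraction(3, 2), Fraction(2)
converted = ordinary_from_factorial(nb_factorial_cumulants(k, p))
```

For k = 3/2 and p = 2 both routes give the cumulants 3, 9, 45 and 333.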







The main purpose of this paper is to carry somewhat further Fisher’s investigations 
(Fisher, 1941; Fisher, Corbet & Williams, 1943) into the sampling properties of these dis- 
tributions. The following is a brief summary of contents. 

In §2 the negative binomial form of distribution is compared with seven other two- 
parameter forms of distribution that have been proposed by various writers. It is shown that 
they can be arranged in order of increasing skewness and tail length, and that they vary in 
the number of modes possible in the frequency function. Thus while Neyman’s Type A 
contagious distribution may have an unlimited number of modes, a distribution given by 
Pélya may have either one or two modes, and the negative binomial and a discrete form of 
the lognormal distribution have always one mode. The estimation of the distribution of 
local mean values in heterogeneous Poisson sampling is considered. 

In §3 the estimation of the parameters of a negative binomial distribution from a single 
large sample is considered.* Alternatives to the maximum-likelihood method are described 
and their efficiencies indicated. Three such methods are found to be of practical importance: 
estimation by the first two sample moments, estimation by the first sample moment and the 
observed proportion of zero readings, and estimation with the aid of a transformation of the 
observations which makes the variance independent of the mean. 

In § 4 two large-sample tests are described for discriminating between alternative forms 
of parent distribution. Each test is fully efficient in certain circumstances. 

In § 5 the estimation of a common exponent from a series of samples is considered, when 
the parent populations possibly differ in their means. The results of §§ 3 and 5 have already 
been summarized by me elsewhere (1949), and their use discussed. 

Finally, §§6 and 7 deal with the logarithmic series distribution. The estimation of α by
maximum likelihood, and some alternative formulae for its sampling variance, are discussed.
Two tests of departure from the logarithmic series form of distribution are considered,
one of them being due to Fisher and the other new.

Notation. The following notation will be used for the negative binomial distribution:

m, k, p, X, $P_r$ as defined above.
N = total number of observations in sample.
$n_r$ = number of observations equal to r (for r ≥ 0).
$\bar r = \sum n_r r/N$ = mean of sample.
$s^2 = \sum_{r=0}^{\infty} n_r(r-\bar r)^2/(N-1)$ = variance estimate.

For other distributions in §2, k and p are defined so that the mean = kp, variance = kp(1 + p).
The notation for the logarithmic series distribution will be:

α, p, X, $n_r$ as defined above.


eo 
S= > n,. 
r=1 
ao 
[=> ny. 
r=1 


* The investigation may be compared with that of Shenton (1949) for Neyman’s two-parameter 
Type A distribution. 










An estimate of a parameter will be denoted by the same symbol with circumflex added. 
Different estimates of the same parameter are not distinguished in the notation, but only 
by context. 


2. COMPARISON OF NEGATIVE BINOMIAL WITH OTHER DISTRIBUTIONS


A number of ways are known in which the negative binomial distribution can arise: 

(1) Inverse binomial sampling. If a proportion θ of individuals in a population possess
a certain character, the number of observations in excess of k that must be taken to obtain
just k individuals with the character has a negative binomial distribution with exponent k
(Yule, 1910; Haldane, 1945).*

(2) Heterogeneous Poisson sampling. If the mean λ of a Poisson distribution varies randomly
from occasion to occasion, a 'compound Poisson distribution' results (Feller, 1943).
We obtain a negative binomial with exponent k if λ has a Type III distribution, proportional
to a χ² distribution with 2k degrees of freedom (Greenwood & Yule, 1920).

(3) Randomly distributed colonies. If colonies or groups of individuals are distributed 
randomly over an area (or in time) so that the number of colonies observed in samples of 
fixed area (or duration) has a Poisson distribution, we obtain a negative binomial distribution 
for the total count if the numbers of individuals in the colonies are distributed independently 
in a logarithmic distribution (Liiders, 1934; Quenouille, 1949). 

(4) Immigration-birth-death process. A certain simple model of population growth, in
which there are constant rates of birth and death per individual and a constant rate of 
immigration, leads to a negative binomial distribution for the population size (McKendrick, 
1914; Kendall, 1949). The model has been applied to the growth of some living populations, 
e.g. populations of bacteria, and to the spread of an infectious disease in a community. 

The first of these, inverse binomial sampling, is the simplest mathematically, and is the
only one where the mathematical model is likely to hold exactly in practice. While θ may be
unknown and require estimation, k is known, and the estimation problems discussed in the
present paper are irrelevant. Inverse binomial sampling will therefore not be considered
further.

In general heterogeneous Poisson sampling λ may be supposed to have a distribution
dU(λ) with mean m. If λ has a cumulant-generating function

$$\phi(t) = \ln E(e^{\lambda t}) = \sum_{i=1}^{\infty}\kappa_{[i]}\,t^i/i!, \tag{2.1}$$

then $\phi(t)$ is the factorial-cumulant-generating function of the distribution of the observed
count r, and $\phi(e^t-1)$ the generating function of ordinary cumulants. Hence if λ has a Type
III distribution with cumulant-generating function

$$\phi(t) = -k\ln(1-mt/k), \tag{2.2}$$

r has the negative binomial distribution required.
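
The heterogeneous Poisson derivation can be checked by direct numerical integration. The sketch below mixes Poisson probabilities over a Type III (gamma) distribution of the local mean with exponent k and mean m, using a plain trapezoidal rule; the function names, the integration limits and the restriction to k > 1 (so that the integrand vanishes at the origin) are assumptions of the illustration.

```python
import math


def neg_binomial_pmf(r, m, k):
    """Negative binomial probability with mean m and exponent k."""
    log_p = (math.lgamma(k + r) - math.lgamma(r + 1) - math.lgamma(k)
             - k * math.log1p(m / k) + r * math.log(m / (m + k)))
    return math.exp(log_p)


def mixed_poisson_pmf(r, m, k, upper=40.0, steps=40000):
    """Trapezoidal approximation to the integral over lam of
    Poisson(r; lam) times a gamma density with shape k and mean m.
    Assumes k > 1, so that the integrand is zero at lam = 0."""
    rate = k / m
    log_const = k * math.log(rate) - math.lgamma(k) - math.lgamma(r + 1)

    def integrand(lam):
        if lam <= 0.0:
            return 0.0
        return math.exp(log_const + (r + k - 1) * math.log(lam)
                        - (1.0 + rate) * lam)

    h = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h


m, k = 3.0, 2.0
pairs = [(mixed_poisson_pmf(r, m, k), neg_binomial_pmf(r, m, k)) for r in range(6)]
```

Each pair in `pairs` agrees to within the accuracy of the quadrature.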
With the model of randomly distributed colonies, let the number of colonies observed per
sample have a Poisson distribution with mean $m_1$, and let the number of individuals ρ per

* It appears from I. Todhunter’s History that the earliest general statement of the negative binomial 
distribution in this connexion was by Montmort in 1714. As to the other methods of deriving the dis- 
tribution, a more detailed review has been given by Irwin (1941). 







colony have a distribution with frequency function $u_\rho$. If the latter distribution has
factorial-cumulant-generating function

$$\psi(t) = \ln E\{(1+t)^{\rho}\} = \sum_{i=1}^{\infty}L_i\,t^i/i!, \tag{2.3}$$

then $m_1(e^{\psi(t)}-1)$ is the factorial-cumulant-generating function of the distribution of the
total count r, and the first four factorial cumulants of r are

$$m_1L_1,\quad m_1(L_2+L_1^2),\quad m_1(L_3+3L_1L_2+L_1^3),\quad m_1(L_4+4L_1L_3+3L_2^2+6L_1^2L_2+L_1^4).$$

We obtain the negative binomial with parameters p and k if $m_1 = k\ln(1+p)$ and if ρ has the
logarithmic distribution

$$u_\rho = \frac{1}{\rho\ln(1+p)}\Big(\frac{p}{1+p}\Big)^{\rho} \quad (\rho \ge 1), \tag{2.4}$$

which has mean $m_2 = p/\ln(1+p)$ and factorial-cumulant-generating function

$$\psi(t) = \ln\Big\{1-\frac{\ln(1-pt)}{\ln(1+p)}\Big\}. \tag{2.5}$$

Models (3) and (4) for the negative binomial are closely associated, in that we may use 
(4) to justify (3). But it will be convenient below to consider models of randomly distributed 
colonies without specifying an evolutionary stochastic process that could give rise to it, and 
so the two models have been separated. 
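
The model of randomly distributed colonies lends itself to direct computation. The sketch below uses the standard recursion for a Poisson number of colonies with independent colony sizes (a technique not used in the paper itself, introduced here only for illustration) and checks that logarithmic colony sizes with a mean number of colonies k ln(1 + p) reproduce the negative binomial. The function names are hypothetical.

```python
import math


def compound_poisson_pmf(m1, u, rmax):
    """Distribution of the total count when the number of colonies is Poisson
    with mean m1 and colony sizes have frequency function u(j), j >= 1.
    Uses the recursion r * P_r = m1 * sum_j j * u(j) * P_(r-j)."""
    P = [math.exp(-m1)]
    for r in range(1, rmax + 1):
        P.append(m1 / r * sum(j * u(j) * P[r - j] for j in range(1, r + 1)))
    return P


def logarithmic_pmf(j, p):
    """Logarithmic distribution of colony size, j = 1, 2, ..."""
    return (p / (1.0 + p)) ** j / (j * math.log1p(p))


def neg_binomial_pmf(r, m, k):
    """Negative binomial probability with mean m and exponent k."""
    log_p = (math.lgamma(k + r) - math.lgamma(r + 1) - math.lgamma(k)
             - k * math.log1p(m / k) + r * math.log(m / (m + k)))
    return math.exp(log_p)


k, p = 2.0, 1.5
colonies = compound_poisson_pmf(k * math.log1p(p),
                                lambda j: logarithmic_pmf(j, p), 30)
direct = [neg_binomial_pmf(r, k * p, k) for r in range(31)]
```

The same recursion applies to any other colony-size distribution on the positive integers, so it can serve for the further colony models of this section as well.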

While we may expect that distributions closely resembling the negative binomial will 
often in fact be observed in population counts and in the sampling of heterogeneous material, 
it will not be surprising if sometimes the specific assumptions made in the above models are 
so wide of the mark that a substantially different form of distribution appears. Before 
embarking on a detailed study of the sampling properties of the negative binomial it will be 
as well to consider briefly what other distributions have been proposed that might perhaps fit 
such observations better. Attention will be confined to distributions having only two adjust- 
able parameters; there seem to be seven of these outstanding in addition to the negative 
binomial. A convenient method of comparison is to express each distribution in terms of 
parameters k and p such that the mean and variance are kp and kp(1 + p), and then evaluate 
the third and fourth factorial cumulants. Results are shown in Table 1. The distributions 
can also be compared by computing specimen frequency functions; this is done in Tables 
2 and 3. 

The two-parameter contagious distribution of Type A of Neyman (1939) arises from the
model of randomly distributed colonies in which the number of colonies per sample has
a Poisson distribution with mean $m_1$, if we assume that the number of individuals per colony
also has a Poisson distribution, say with mean $m_2$, $m_1$ and $m_2$ being positive constants.
A derivation along these lines has been given by Cernuschi & Castagnetto (1946), who,
however, appear not to have recognized what they derived. The distribution can also arise
from heterogeneous Poisson sampling, if λ has a discrete distribution and is equal to $m_2z$,
where z has a Poisson distribution with mean $m_1$. This is more or less the model Neyman used
in deriving the distribution. The frequency function is, for r ≥ 0,

$$P_r = \frac{m_2^r}{r!}\,e^{-m_1}\sum_{j=0}^{\infty}\frac{j^r}{j!}\,(m_1e^{-m_2})^j, \tag{2.6}$$

and its factorial-cumulant-generating function is

$$m_1(e^{m_2t}-1). \tag{2.7}$$
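
The frequency function of the Type A distribution can be evaluated directly from its series form, obtained by summing Poisson probabilities with mean $m_2z$ over a Poisson number z of colonies. A minimal sketch follows; the function name and the truncation point `jmax` of the infinite sum are assumptions of the illustration.

```python
import math


def neyman_type_a_pmf(r, m1, m2, jmax=150):
    """Type A probability of a total count r, with the sum over the number
    of colonies j truncated at jmax (0**0 is taken as 1 for r = 0)."""
    w = m1 * math.exp(-m2)
    series = sum(j ** r / math.factorial(j) * w ** j for j in range(jmax + 1))
    return m2 ** r / math.factorial(r) * math.exp(-m1) * series


m1, m2 = 2.0, 1.5
probs = [neyman_type_a_pmf(r, m1, m2) for r in range(81)]
total = sum(probs)
mean = sum(r * p for r, p in enumerate(probs))
var = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
```

For these values the mean and variance come out as $m_1m_2$ and $m_1m_2(1+m_2)$, in agreement with the first two factorial cumulants of the generating function.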
















To express the mean and variance in the form kp and kp(1 + p) we must set $m_1 = k$,
$m_2 = p$.

Neyman's two-parameter contagious distributions of Types B and C were derived from
a more complicated model. The factorial-cumulant-generating function of the Type B is

$$m_1\Big\{\frac{e^{m_2t}-1}{m_2t}-1\Big\}, \tag{2.8}$$

and we find $\tfrac34m_1 = k$, $\tfrac23m_2 = p$. The factorial-cumulant-generating function of the Type C
distribution is

$$m_1\Big\{\frac{2(e^{m_2t}-1-m_2t)}{(m_2t)^2}-1\Big\}, \tag{2.9}$$

and $\tfrac23m_1 = k$, $\tfrac12m_2 = p$. These two generating functions, (2.8) and (2.9), can be derived from
that of Type A, (2.7), by a suitable integration. Thus to get (2.8) we replace $m_2$ in (2.7) by
x and $m_1$ by $(m_1/m_2)\,dx$, and integrate for x between 0 and $m_2$; while to get (2.9) we do the
same except that $m_1$ is replaced by $[2m_1(m_2-x)/m_2^2]\,dx$. The observed variable r can therefore
be regarded, in each case, as the limit of the sum of a large number of random variables
following independent Type A distributions with values of the second parameter distributed
in a range (0, $m_2$).

Recently, Thomas (1949) has proposed a distribution very similar to Neyman's Type A.
With the model of randomly distributed colonies, where the number of colonies in the
sample has a Poisson distribution with mean $m_1$, the number of individuals per colony is
assumed to be one plus an observation from a Poisson distribution with mean $m_2-1$. $m_2$ is
now a constant > 1. The factorial-cumulant-generating function of the distribution is

$$m_1\{(1+t)\,e^{(m_2-1)t}-1\}. \tag{2.10}$$

If $m_2$ is large the distribution is close to the Neyman Type A with the same values of the
parameters $m_1$, $m_2$. For $m_2-1$ small, we may set

$$m_2 = 1+\tfrac12p+\tfrac18p^2+O(p^4), \qquad m_1m_2 = kp.$$


Pólya (1930) gives a distribution that arises from the model of randomly distributed
colonies when the number of individuals per colony has a geometric distribution with
frequency function

$$u_\rho = (1-\tau)\,\tau^{\rho-1}. \tag{2.11}$$

ρ takes positive integer values, and τ is a constant, 0 < τ < 1. The mean number of individuals
per colony is $m_2 = (1-\tau)^{-1}$. The frequency function of the total observed count r per sample
is given by

$$P_0 = e^{-m_1}, \qquad P_r = e^{-m_1}\tau^r\sum_{j=1}^{r}\binom{r-1}{j-1}\frac{1}{j!}\Big\{\frac{m_1(1-\tau)}{\tau}\Big\}^{j} \quad (r \ge 1), \tag{2.12}$$

and the factorial-cumulant-generating function is

$$\frac{m_1t}{1-\tau-\tau t}. \tag{2.13}$$

We find $m_1/(2\tau) = k$, $2\tau/(1-\tau) = p$. Pólya states that the distribution was given by A. Aeppli
in a thesis in 1924. It will accordingly be referred to here as the Pólya-Aeppli distribution.

Preston (1948) has considered a distribution derived from the model of heterogeneous
Poisson sampling, where it is supposed that λ has a lognormal distribution, i.e. that ln λ
has a normal distribution, say with mean ξ and variance σ². The distribution may be



conveniently referred to as the discrete lognormal distribution.* It suffers from the
disadvantage that its frequency function involves an untabulated integral;† for r ≥ 0,

$$P_r = \frac{m^r\,\tau^{\frac12r(r-1)}}{r!\,\sqrt{(2\pi\ln\tau)}}\int_{-\infty}^{\infty}\exp\Big\{-\frac{u^2}{2\ln\tau}-m\,\tau^{r-\frac12}e^{u}\Big\}\,du, \tag{2.14}$$

where $m = \exp(\xi+\tfrac12\sigma^2)$, $\tau = \exp(\sigma^2)$. However, the first few factorial cumulants are easily
found, for they are the ordinary cumulants of the distribution of λ, and these are obtained
(Finney, 1941) from the moments of λ about the origin,

$$\mu_i' = \exp(i\xi+\tfrac12i^2\sigma^2) = m^i\,\tau^{\frac12i(i-1)}. \tag{2.15}$$

The mean and variance of λ are therefore m and $m^2(\tau-1)$; and we have

$$(\tau-1)^{-1} = k, \qquad m(\tau-1) = p.$$

Another rather intractable distribution derived from the model of heterogeneous Poisson
sampling has been given by Fisher (1931), who supposed that λ was distributed like the
square root of a Type III variable. The frequency function can be expressed in terms of
Hh-functions, which have been tabulated to some extent. The cumulants involve Γ-functions.
For the entry in Table 1 the limit p → 0 with kp constant has been considered. The distribution
will be referred to as Fisher's Hh-distribution.








Table 1 

Distribution          κ[3]/(kp³)                κ[4]/(kp⁴)
Thomas                ¾ + ⅛p + O(p²)            ½ + O(p)
Fisher Hh 1+k-!+O(p*) 0+0O(p) 
Neyman A              1                         1
Neyman B              9/8                       27/20
Neyman C              6/5                       8/5
Pólya-Aeppli          3/2                       3
Negative binomial     2                         6
Discrete lognormal    3 + k⁻¹                   16 + 15k⁻¹ + 6k⁻² + k⁻³

















The third and fourth factorial cumulants of the above distributions are given in Table 1.
It will be seen that, apart from Fisher’s Hh distribution, they form a sequence of distributions
of increasing skewness and tail length (leptokurtosis) in the order shown. The position of the
Hh distribution relative to the others is ambiguous and variable (ambiguous in that it
depends on whether we rank by κ[3] or by κ[4], variable because it depends on the values of the
parameters), but we may say at least that it should come somewhere towards the front of
the list.


* Preston does not give any exact sampling theory. Other writers (e.g. Williams, 1937; Gaddum, 1945) 
who have alluded to lognormal distributions in connexion with frequency counts have contented them- 
selves with recommending that the data should be transformed by a logarithmic transformation of the 
form y = log(r+c), so as to appear approximately normal. It should also be noted that Preston is con- 
cerned with the situation where zero counts are not recorded and therefore the total sample size N is 
unknown. This will be discussed in §§ 6 and 7. 

† A usable approximation to the frequency function of the discrete lognormal distribution has been
developed by Dr P. M. Grundy. 













The difference in shape between the distributions is clearly substantial if p is large. To
demonstrate this further some expected frequencies are given in Table 2 for three
distributions having p = 10, mean = 20, variance = 220, namely,

(a) Neyman Type A, with m₁ = 2, m₂ = 10;
(b) Pólya-Aeppli, with m₁ = 3⅓, τ = ⅚;
(c) negative binomial, with k = 2, m = 20.
Also shown is a distribution having mean = 20, variance = 218, namely,
(d) Thomas, with m₁ = 2, m₂ = 10.


To save space, the frequencies have been grouped. The Neyman distribution (a) has modes
or peaks at r = 0, 10, 20, while at r = 30 there is a mode in first differences which is not large
enough to produce a mode in the frequencies themselves. The Thomas distribution (d) is
practically indistinguishable from (a), the difference being that the modes of (d) are slightly
more pronounced than those of (a). The Pólya-Aeppli distribution (b) has two modes only,
at r = 0 and 11. The negative binomial (c) has one mode, at r = 9 and 10 (equal frequencies).
If a discrete lognormal distribution were added to Table 2 (with m = 20, τ = 1.5, ξ = 2.7930,
and σ = 0.6368) it would resemble (c) in having only one mode, but would be rather more
skew; the frequencies for the first few values of r would be lower.
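
The frequencies compared here can be recomputed directly from the frequency functions. The Python sketch below is an illustration under stated assumptions, not the method used in the paper: the negative binomial is parametrized by k and p with mean kp, the Pólya-Aeppli probabilities follow the sum over colonies described for that distribution, and the Neyman Type A probabilities use the standard recurrence, which the text does not quote. All function names are hypothetical.

```python
import math

def negative_binomial_pmf(r, k, p):
    """P_r for the negative binomial with exponent k and mean k * p."""
    log_prob = (math.lgamma(k + r) - math.lgamma(k) - math.lgamma(r + 1)
                + r * math.log(p / (1.0 + p)) - k * math.log(1.0 + p))
    return math.exp(log_prob)

def polya_aeppli_pmf(r, m1, tau):
    """P_r for Poisson(m1) colonies with geometric colony sizes (ratio tau)."""
    if r == 0:
        return math.exp(-m1)
    a = m1 * (1.0 - tau) / tau
    total = sum(math.comb(r - 1, i - 1) * a ** i / math.factorial(i)
                for i in range(1, r + 1))
    return math.exp(-m1) * tau ** r * total

def neyman_type_a_pmf(r_max, m1, m2):
    """[P_0, ..., P_r_max] for Neyman Type A via the standard recurrence
    r * P_r = m1 * m2 * exp(-m2) * sum_j (m2**j / j!) * P_{r-1-j}."""
    probs = [math.exp(-m1 * (1.0 - math.exp(-m2)))]
    for r in range(1, r_max + 1):
        s = sum(m2 ** j / math.factorial(j) * probs[r - 1 - j]
                for j in range(r))
        probs.append(m1 * m2 * math.exp(-m2) / r * s)
    return probs

# Percentage frequencies at r = 0 for distributions (a), (b), (c).
zero_a = 100.0 * neyman_type_a_pmf(40, 2.0, 10.0)[0]
zero_b = 100.0 * polya_aeppli_pmf(0, 10.0 / 3.0, 5.0 / 6.0)
zero_c = 100.0 * negative_binomial_pmf(0, 2.0, 10.0)
```

Summing the returned probabilities over the grouped ranges of r gives the grouped percentages; the factorials limit `neyman_type_a_pmf` to moderate `r_max` in floating point.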


Table 2

              Percentage frequency
   r        (a)      (b)      (c)      (d)
   0       13.53     3.57     0.83    13.53
   1-2      0.07     4.18     3.55     0.03
   3-4      0.72     4.96     5.31     0.54
   5-6      2.74     5.52     6.34     2.56
   7-8      5.54     5.89     6.86     5.66
   9-10     7.01     6.07     7.01     7.29
  11-12     6.41     6.10     6.90     6.47
  13-14     5.18     6.00     6.62     5.00
  15-16     4.73     5.80     6.22     4.59
  17-18     5.04     5.52     5.77     5.09
  19-20     5.37     5.18     5.28     5.52
  21-22     5.23     4.81     4.79     5.33
  23-24     4.74     4.42     4.31     4.74
  25-26     4.19     4.02     3.86     4.15
  27-28     3.78     3.63     3.43     3.77
  29-30     3.48     3.25     3.03     3.51
  31-32     3.19     2.89     2.67     3.23
  33-∞     19.05    18.20    17.22    18.98






































A less extreme comparison is shown in Table 3, which compares distributions having p = 3,
mean = 6, variance = 24, namely,

(a) Neyman Type A, with m₁ = 2, m₂ = 3;

(b) Pólya-Aeppli, with m₁ = 2.4, τ = 0.6;

(c) negative binomial, with k = 2, m = 6.
(a) and (b) have two modes, (c) has one.

In general, the Neyman Type A and Thomas distributions can have any number of modes
from one upwards, and if there are several modes they will occur at values of r approximately
equal to multiples of m₂. The Pólya-Aeppli distribution, on the other hand, has either one
or two modes: two if 2 < m₁ < (1 − τ)⁻¹, one otherwise. The negative binomial distribution
has always one mode. Presumably, by analogy, Fisher’s Hh distribution has one or two
modes, and the discrete lognormal one mode.




For the mere purpose of graduating data there is little to choose between the distributions 
in shape if p (=m/k) is not large, and then considerations of ease of handling weigh heavily 
in favour of the negative binomial. Experimental discrimination between the different 
forms of distribution is practicable, however, if p is large. An interesting attempt at such 
discrimination has been made by Beall (1940), who fitted Neyman Types A, B, C,
Pólya-Aeppli, and negative binomial distributions to eleven series of counts of insect larvae. Some
of the series seemed to indicate a bimodal population, and he concluded that they were well 
fitted by the Neyman forms of distribution, but not by the other two. Mr D. A. Evans has 
pointed out to me, however, that Beall fitted the latter distributions incorrectly, having 
mixed up the two parameters, and he was consequently unfair to them. 






































Table 3

           Percentage frequency
   r        (a)      (b)      (c)
   0       14.95     9.07     6.25
   1        4.47     8.71     9.37
   2        7.36     9.41    10.55
   3        8.77     9.49    10.55
   4        8.83     9.12     9.89
   5        8.29     8.46     8.90
   6        7.60     7.63     7.79
   7        6.86     6.71     6.67
   8        6.06     5.80     5.63
   9        5.23     4.92     4.69
  10        4.42     4.12     3.87
  11        3.67     3.40     3.17
  12        3.00     2.78     2.57
  13        2.42     2.25     2.08
  14        1.92     1.80     1.67
  15        1.51     1.43     1.34
  16        1.17     1.13     1.06
  17-∞      3.48     3.78     3.95








In analysing population counts we may have two quite distinct objects. On the one hand, 
the counts may have been made on plots in an experiment, and we desire some means of 
interpreting them so that the effects of treatments can be judged. What is usually done is to 
apply a transformation to make the method of analysis of variance appropriate. We study 
the distribution of the original counts in order to find a suitable transformation. It does not 
matter greatly whether the form of distribution fitted, if any, is very accurate or not. 

On the other hand, we may be interested in relating the observed counts to some theory 
of population growth or spread. In that case we shall endeavour to use only forms of dis- 
tribution that are ‘biologically significant’. Neyman’s distributions were intended to repre- 
sent populations of insect larvae observed shortly after emergence from eggs, the eggs being 
supposed laid in clusters of a fixed size. The models seem rather special, and not likely to be 
widely applicable to other sorts of population counts. The characteristic feature of the 
distributions (at any rate, that of Type A) is the possibility of a series of three or more 
equally spaced modes. Unless such a series of modes is demonstrated conclusively by
observation, one may reasonably feel reluctant to use such a model. Thomas’s distribution was
intended to represent plant quadrat counts, but no evidence has been adduced to make
plausible the special form of u_ρ assumed, and again one may reasonably feel reluctant to
use it. The derivation of the Pólya-Aeppli distribution from the model of randomly distributed
colonies is much more promising. Kendall (1949) has shown that the progeny of a single 
individual after a fixed lapse of time will follow a geometric distribution with modified zero 










term, in certain fairly general conditions of no competition. Hence if progenitors (e.g. plant 
seeds) are released randomly over an area at one time and their progeny (freely increasing 
by vegetative reproduction) are observed at a later time, we shall expect the number of 
individuals per quadrat to follow the Pólya-Aeppli distribution. The parameter m₁ will be
the mean number of progenitors per quadrat of which some progeny survive. If, instead of
being released all at one time, the progenitors are released with uniform distribution in time 
from a particular time up to the present, and if the birth- and death-rates per individual are 
constant, we get the negative binomial, as already remarked at the beginning of this section. 
We may therefore expect that close approximations to both these distributions will in fact 
be observed in the study of growing populations. Of course, some population counts will not 
resemble any of the distributions we have considered, on account of overcrowding or, with 
mobile fauna, aggregating for reproduction, defence, or other social purposes. It is unlikely 
that any two-parameter distribution will describe such counts adequately. 

In view of the difficulty of discriminating experimentally between forms of distribution 
arising from different mathematical models, the study of counts made all at one time is not 
likely to give reliable information on laws of population growth. For this purpose, repeated 
observations on the same population are needed, if possible with identification of individuals. 

Sometimes, in sampling investigations, it is reasonable to suppose that the observations
arise from heterogeneous Poisson sampling, but there may be no definite grounds for
predicting the distribution of λ. If the observations are sufficiently numerous, the distribution
of λ can be estimated (Newbold, 1927). Let k₁, k₂, ..., be the k-statistics calculated from a
sample of N observations on r (see, for example, Kendall, 1943). Then we can take as unbiased
estimates of the first four cumulants κ₁, κ₂, κ₃, κ₄ of λ the following:

$$\begin{aligned}
\hat\kappa_1 &= k_1,\\
\hat\kappa_2 &= k_2 - k_1,\\
\hat\kappa_3 &= k_3 - 3k_2 + 2k_1,\\
\hat\kappa_4 &= k_4 - 6k_3 + 11k_2 - 6k_1.
\end{aligned} \tag{2.16}$$


The right-hand sides are in fact unbiased estimates of the factorial cumulants of the
distribution of r, analogous to the well-known k-statistics for ordinary cumulants. It is quite
straightforward to calculate the variances (and other cumulants) of these estimates. If, for
example, κ₁ = m, κ₂ = σ², and all the κ_i for i ≥ 3 are zero or negligible, we find (for N large)

$$\begin{aligned}
\operatorname{var}(\hat\kappa_1) &= \frac{m+\sigma^{2}}{N},\\
\operatorname{var}(\hat\kappa_2) &\simeq \frac{2(m+\sigma^{2})^{2} + 2\sigma^{2}}{N},\\
\operatorname{var}(\hat\kappa_3) &\simeq \frac{6(m+\sigma^{2})^{3} + 18\sigma^{2}(m+3\sigma^{2})}{N}.
\end{aligned} \tag{2.17}$$

One possible application of this method is to estimating the process curve of the output of 
a production line from inspection records which give the number of defectives found in 
samples of a fixed size taken from each lot produced. A similar biological problem in sur- 
veying a district for presence of potato-root eelworm has been described by me elsewhere 
(1950). Newbold developed the method in a study of accident-proneness. 







3. FITTING A NEGATIVE BINOMIAL DISTRIBUTION TO A LARGE SAMPLE


Haldane (1941) and Fisher (1941) have considered the maximum-likelihood method of
fitting a negative binomial distribution to a large sample. If the distribution is expressed in
terms of the parameters m and k, the maximum-likelihood estimates of the parameters turn
out to be independent (asymptotically). For the estimate m̂ of m we have simply

$$\hat m = \bar r. \tag{3.1}$$


The estimate k̂ of k is the root of the following equation in x:

$$N \ln\left(1 + \frac{\bar r}{x}\right) = \sum_{r=1}^{\infty} n_r \left(\frac{1}{x} + \frac{1}{x+1} + \dots + \frac{1}{x+r-1}\right). \tag{3.2}$$

It is easy to show that the left-hand side is greater than the right-hand side if x is large enough,
provided (N − 1)s² > N r̄ (or, ignoring the difference between N and N − 1, if s² > r̄); and also
that the left-hand side is less than the right-hand side if x is small enough (but positive),
provided that n₀ < N, so that there are some non-zero observations. Since both sides are
continuous functions for x > 0, the equation must have at least one finite positive root. I have
not proved that there is only one root in this case, and that if s² < r̄ there is none, but after
an unsuccessful search for a counter-example I suppose both these statements to be true. If
s² ≤ r̄ (and if, indeed, there is no finite root), the excess of the right-hand side over the
left-hand side tends to zero as x → ∞, and we may say k̂ = ∞, implying that a Poisson distribution
is being fitted. If n₀ = N, we may conventionally take k̂ = 1, say.
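
The sign behaviour just described suggests a simple bracketing solution of the maximum-likelihood equation. The Python sketch below is an illustration under stated assumptions, not the author's procedure: it takes the frequencies as a list `counts` with `counts[r]` equal to n_r, returns the conventional values (1 when every count is zero, infinity when (N − 1)s² does not exceed N r̄, relying on the unproved supposition in the text), and otherwise locates a root by bisection. The function names and the bracketing limits are hypothetical.

```python
import math

def ml_equation_gap(x, counts):
    """Left-hand side minus right-hand side of the ML equation for k at x."""
    n_total = sum(counts)
    mean = sum(r * n for r, n in enumerate(counts)) / n_total
    lhs = n_total * math.log1p(mean / x)
    rhs = 0.0
    partial = 0.0  # 1/x + 1/(x+1) + ... + 1/(x+r-1)
    for r in range(1, len(counts)):
        partial += 1.0 / (x + r - 1)
        rhs += counts[r] * partial
    return lhs - rhs

def fit_k_maximum_likelihood(counts, iterations=200):
    n_total = sum(counts)
    total = sum(r * n for r, n in enumerate(counts))
    if total == 0:
        return 1.0  # convention in the text when all observations are zero
    mean = total / n_total
    sum_sq = sum(n * (r - mean) ** 2 for r, n in enumerate(counts))
    if sum_sq <= total:  # (N - 1) s^2 <= N * mean: treat as Poisson
        return math.inf
    lo, hi = 1e-8, 1.0  # the gap is negative for small enough x
    while ml_equation_gap(hi, counts) <= 0.0:
        hi *= 2.0
        if hi > 1e12:  # safety cap (assumption of this sketch)
            return math.inf
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ml_equation_gap(mid, counts) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

k_hat = fit_k_maximum_likelihood([10, 5, 3, 1, 1])
```

Bisection finds one root inside the bracket; it does not by itself establish that the root is unique.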
For the variance of m̂ we have

$$\operatorname{var}(\hat m) = \frac{1}{N}\left(m + \frac{m^{2}}{k}\right). \tag{3.3}$$
From the matrix of expectations of second derivatives of the log-likelihood function, we
obtain the large-sample formulae:

$$\operatorname{cov}(\hat m, \hat k) \sim 0, \tag{3.4}$$

$$\operatorname{var}(\hat k) \sim \frac{2k(k+1)}{N X^{2}} \left\{1 + \frac{4X}{3(k+2)} + \frac{3X^{2}}{(k+2)(k+3)} + \dots\right\}^{-1}. \tag{3.5}$$

The summation involved in deriving the second of these is due to Fisher; the series in curly
brackets may be written

$$1 + 2\sum_{j=2}^{\infty} \frac{j!\,X^{j-1}}{(j+1)(k+2)(k+3)\cdots(k+j)}. \tag{3.6}$$


It may be noted in passing that large-sample variances and covariances found in this way
relate to the asymptotic normal distribution of the estimates, and are not necessarily
asymptotically equal to the actual variances and covariances of the estimates for finite N. In the
present case, in sampling from any negative binomial distribution, there is always a positive
chance, albeit perhaps a minute one, of finding s² ≤ r̄, when we should set k̂ = ∞. k̂ does not
therefore have a proper distribution, nor variance. It does, however, have an asymptotic
distribution with the variance given, as N → ∞.

While equation (3.1) gives m̂ very simply, equation (3.2) for k̂ is tedious to solve, and it is
worth while to consider alternative methods of estimating k. We begin with a general moment
method. Let f_r be any specified convex or concave function (not linear) of the non-negative
integer r. Then we may consider Σ n_r f_r (summed over r from 0 to ∞) as a statistic for k.
Let E(f_r), expressed as a function of m and k, be denoted by F(m, k). Then we shall take as
our estimate k̂ the root of the following equation in x:

$$\frac{1}{N} \sum_{r=0}^{\infty} n_r f_r = F(\bar r, x). \tag{3.7}$$


The right-hand side of this equation is a monotone function of x, with no constant stretches,
for x positive, provided r̄ > 0. For if x is increased to x + δx, the change δP_r in P_r(r̄, x) is
negative for 0 ≤ r ≤ a and for r ≥ b + 1, and non-negative for a + 1 ≤ r ≤ b, being strictly
positive for one of these values of r at least, where a and b are integers satisfying 0 ≤ a < b.
Moreover, Σ δP_r = Σ r δP_r = 0 (sums over r from 0 to ∞). It follows easily that if f_r is
convex from below Σ f_r δP_r < 0,
while if f_r is concave from below the inequality is reversed. Thus in either case the right-hand
side of (3.7) is monotone, as stated; and therefore the equation has at most one positive
solution k̂. On the other hand, in repeated samples from the same negative binomial
distribution, the probability is small that (3.7) has no positive solution, when N is large. For
the left-hand side has a high probability of being close to F(m, k), and F(r̄, x) is a continuous
function of r̄ and x with a range of values, as x varies in (0, ∞), differing little from the range
of values of F(m, x) if r̄ is near to m.


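We can find the large-sample variance of the estimate k̂ given by (3.7) by treating k̂ − k
and r̄ − m as infinitesimals. Denoting these by δk and δm respectively, and n_r/N − P_r
by δP_r, we have

$$\delta m = \sum_{r=0}^{\infty} r\,\delta P_r, \qquad A_k\,\delta k = \sum_{r=0}^{\infty} (f_r - A_m r)\,\delta P_r,$$

where

$$A_m = \frac{\partial}{\partial m} F(m, k), \qquad A_k = \frac{\partial}{\partial k} F(m, k).$$

Now

$$\operatorname{var}(\delta P_i) = \frac{P_i(1 - P_i)}{N}, \qquad \operatorname{cov}(\delta P_i, \delta P_j) = -\frac{P_i P_j}{N} \quad (i, j \ge 0,\ i \ne j),$$

while from (1.1) we find

$$\left(m + \frac{m^{2}}{k}\right) A_m = E(r f_r) - m E(f_r).$$

Hence we obtain the desired results:

$$\operatorname{cov}(\bar r, \hat k) \sim 0, \tag{3.8}$$

$$\operatorname{var}(\hat k) \sim \frac{E(f_r^{2}) - \{E(f_r)\}^{2} - (m + m^{2}/k)\,A_m^{2}}{N A_k^{2}}. \tag{3.9}$$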


The ratio of the variances (3.5) and (3.9) is the large-sample efficiency of method (3.7)
of estimating k. As already remarked, m is easily estimated with full efficiency by equation
(3.1). Since these estimates are in large samples independent, it is appropriate to consider
their efficiencies separately. In general, when considering inefficient estimates of the
parameters of a distribution, a reasonable single measure of large-sample efficiency is provided
by the square root of the ratio of determinants of sampling variance matrices for the
maximum-likelihood estimates and for the alternative estimates under consideration, if there are
two parameters to be estimated, or the cube root if there are three parameters, etc. This
measure is the ratio of the numbers of observations required by the maximum-likelihood
method and the alternative method to achieve the same error variance determinant. In
the present case it would be equal to the square root of the efficiency of estimation of k.
Let us now consider some examples.
Method 1. f_r = r², so that k is estimated from the sample variance s², i.e.

$$\hat k = \frac{\bar r^{2}}{s^{2} - \bar r}. \tag{3.10}$$

On evaluating E(f_r), E(f_r²), A_k, A_m, we find the large-sample variance

$$\operatorname{var}(\hat k) \sim \frac{2k(k+1)}{N X^{2}}, \tag{3.11}$$

and the efficiency of estimation of k is equal to the reciprocal of the expression (3.6).
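Method 2. f₀ = 1, f_r = 0 for r ≥ 1, so that k is estimated from the observed proportion of
zeros,

$$\frac{n_0}{N} = \left(1 + \frac{\bar r}{\hat k}\right)^{-\hat k}. \tag{3.12}$$

The large-sample variance is

$$\operatorname{var}(\hat k) \sim \frac{(1-X)^{-k} - 1 - kX}{N\{-\ln(1-X) - X\}^{2}}. \tag{3.13}$$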
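
The large-sample efficiencies of Methods 1 and 2 follow by dividing the maximum-likelihood variance of k̂ by the variance of each moment estimate. The Python sketch below is an illustration, not part of the original paper: it sums Fisher's series term by term and assumes X = m/(m + k), in line with the definition used for Method 3. The function names, the stopping tolerance and the term limit are choices made for this example.

```python
import math

def fisher_series(k, x, tol=1e-15, max_terms=100000):
    """1 + 2 * sum_{j>=2} j! x**(j-1) / ((j+1)(k+2)(k+3)...(k+j))."""
    total = 1.0
    c = 2.0 * x / (k + 2.0)  # j = 2 value of j! x**(j-1) / ((k+2)...(k+j))
    j = 2
    while j < max_terms:
        term = 2.0 * c / (j + 1)
        total += term
        if term < tol * total:
            break
        c *= (j + 1) * x / (k + j + 1)  # step from j to j + 1
        j += 1
    return total

def efficiency_method1(m, k):
    """Efficiency of the variance-based estimate: reciprocal of the series."""
    x = m / (m + k)
    return 1.0 / fisher_series(k, x)

def efficiency_method2(m, k):
    """Efficiency of the estimate based on the proportion of zeros."""
    x = m / (m + k)
    d = -math.log1p(-x) - x
    var_ml = 2.0 * k * (k + 1.0) / (x * x * fisher_series(k, x))  # times 1/N
    var_zeros = ((1.0 - x) ** (-k) - 1.0 - k * x) / (d * d)       # times 1/N
    return var_ml / var_zeros

e1 = efficiency_method1(1.0, 1.0)
e2 = efficiency_method2(1.0, 1.0)
```

For k = 1 the series reduces to a dilogarithm, which gives an independent check: at m = 1 the two efficiencies are about 0.76 and 0.91 respectively.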
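Method 3. f_r = 1/(r + 1). k̂ is given by

$$\frac{1}{N}\sum_{r=0}^{\infty} \frac{n_r}{r+1} = \frac{(1-\hat X) - (1-\hat X)^{\hat k}}{(\hat k - 1)\hat X}, \quad \text{where } \hat X = \frac{\bar r}{\bar r + \hat k}. \tag{3.14}$$

The large-sample variance of k̂ is given by (3.9), where

$$\left.\begin{aligned}
E(f_r) &= \frac{(1-X) - (1-X)^{k}}{(k-1)X}, \qquad
E(f_r^{2}) = \frac{(1-X)^{k}}{(k-1)X} \int_{1-X}^{1} \frac{t^{-(k-1)} - 1}{1-t}\,dt,\\
A_m &= \frac{1-X}{k(k-1)X^{2}} \left[\{1 + (k-1)X\}(1-X)^{k} - (1-X)\right],\\
A_k &= \frac{1}{k(k-1)^{2}X} \left[\{1 - k(k-1)\ln(1-X) - (k-1)^{2}X\}(1-X)^{k} - (1-X)\right].
\end{aligned}\right\} \tag{3.15}$$

This is for k ≠ 1. Corresponding expressions when k = 1 are easily found directly.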
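Method 4. f_r = c^r, where c is a positive constant not equal to 1. We find

$$\frac{1}{N}\sum_{r=0}^{\infty} n_r c^{r} = \left[1 + (1-c)\,\bar r/\hat k\right]^{-\hat k}; \tag{3.16}$$

and for the large-sample variance

$$\left.\begin{aligned}
E(f_r) &= \{1 + p(1-c)\}^{-k}, \qquad E(f_r^{2}) = \{1 + p(1-c^{2})\}^{-k},\\
A_m &= -(1-c)\{1 + p(1-c)\}^{-k-1},\\
A_k &= \{1 + p(1-c)\}^{-k}\left\{-\ln[1 + p(1-c)] + \frac{p(1-c)}{1 + p(1-c)}\right\}.
\end{aligned}\right\} \tag{3.17}$$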
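
Because the expectation F(r̄, x) is monotone in x, any of the moment methods can be solved by bisection once F is available in closed form. The Python sketch below illustrates this for Method 4 with 0 < c < 1; it is a minimal sketch, not taken from the paper, and the function names and the search interval for x are assumptions made for the example.

```python
import math

def expected_c_power(m, k, c):
    """E(c**r) for a negative binomial with mean m and exponent k."""
    return math.exp(-k * math.log1p((1.0 - c) * m / k))

def solve_moment_equation(statistic, mean, expectation,
                          lo=1e-9, hi=1e9, iterations=200):
    """Root x of expectation(mean, x) = statistic by bisection on log x.

    Returns None when the statistic is outside the range attained on
    [lo, hi], i.e. when no root is bracketed.
    """
    f_lo = expectation(mean, lo) - statistic
    f_hi = expectation(mean, hi) - statistic
    if f_lo * f_hi > 0.0:
        return None
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        f_mid = expectation(mean, mid) - statistic
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Consistency check: feed in the exact expectation for m = 3, k = 2, c = 0.5.
c = 0.5
statistic = expected_c_power(3.0, 2.0, c)
k_hat = solve_moment_equation(statistic, 3.0,
                              lambda m, x: expected_c_power(m, x, c))
```

In practice `statistic` would be the observed average of c^r and `mean` the sample mean; the same solver applies to any other f_r for which E(f_r) can be computed.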

The above seem to exhaust the tractable applications of the moment method. Several
other forms for the function f_r suggest themselves as possibly worth investigating, such as
(i) f₀ = 0, f_r = 1 + ½ + ⅓ + ... + 1/r for r ≥ 1, (ii) f_r = ln(r + 1) for r ≥ 0, and (iii) f_r = √r for r ≥ 0;
but unfortunately these do not lead to a simple expression for E(f_r). Their practical use is
therefore ruled out unless special tables are constructed for estimating k from Σ n_r f_r/N.
















A method of a different kind is the following. 

Method 5. We guess a value of k, apply a transformation (depending on k) to the
observations to make the variance independent of the mean, and then obtain an improved estimate
of k by equating the observed variance to the expected; the process is repeated until the
answer becomes stable. Suitable transformations were investigated by me (1948) and
summarized by me (1949). From a consideration of equation (4.37) of the former paper, which
gives an asymptotic expansion for the variance of the transformed variable when k = 1, it
appears that the transformation method is unlikely to be serviceable for k < 1. For higher
values of k, the method is possible for not-too-low values of m. In principle the method could
be used for any values of m and k, if the expected variance of the transformed observations
were known as a function of m and k in a convenient form, e.g. by an adequate double-entry
table. But such a table is not available, and so the method is restricted to those values of m
for which the limiting variance as m → ∞ is near enough attained.

The large-sample variance of the estimate of k derived by the transformation method will
now be found. Let y denote the transformed variable, a function of the observed count r,
when the true value of k is inserted in the formula for y; and let ŷ be the transformed variable
when an estimate k̂ of k is used. Then in samples from a fixed negative binomial distribution
with parameters m and k, we have, from equation (4.23) of my 1948 paper (setting A = ½k),

var (9) = w'(k)+ (oe + (=) ; (3-18) 
as m → ∞, if k > 2.* Let s² be the variance estimate calculated from the observations after
transformation to y-variables, and ŝ² that calculated after the observations have been
transformed to ŷ-variables. We choose k̂, by successive approximation, so that

$$\hat s^{2} = \psi'(\hat k). \tag{3.19}$$
Let ∂s²/∂k denote the derivative of s² (for the given sample of observations) with respect to
the parameter k entering into the y-transformation, when k is set equal to its true value. For
any arbitrary k̂, not necessarily close to k, we have in probability as N → ∞


§ = var wen aanie’ 


0s? 
and therefore aver (9) + ia 
Ok k=k 


Hence, if k̂ is determined by (3.19) and we set k̂ − k = δk and s² − var(y) = δσ², we have in
probability for large N and m


0s? kék 
)= # ae ~~ a eS 2 
W'(k) + "(k) dk = p(k —- + ay ok Y'(k) + 80 (k—1)?m 


k 
1)?m 
The large-sample variance of the left-hand side is found from equations (4-23) and (4-30) of 
my 1948 paper, and we obtain the result 


var ym Ya 
": [yw +e Tl 


* ψ(t), ψ′(t), ... denote the successive derivatives of ln Γ(t).


and therefore do? = \v"e) +—_—.—_ ik =i} Ok: 





(3-20) 










This should be sufficiently accurate for practical purposes if m is above 50 and k above 5,
assuming that the appropriate inverse hyperbolic sine transformation is used. It is not clear
from existing calculations how far (3.20) may be relied on outside this region.

We thus have an assortment of alternatives to the maximum-likelihood method of
estimating k. Let E₁, E₂, E₃, E₄, E₅ denote the large-sample efficiencies of the five methods.
By tabulating these efficiencies we can see under what conditions, if any, each method is
likely to prove useful. In the figure are shown 50, 75, 90 and 98% contours of E₁ and E₂,
and 90 and 98% contours of E₅. The 75% contour of E₅ has been found only for m very large,
when it is close to the line k = 1; it is not shown in the figure. Since Method 5 can hardly be
used when k < 1, no attempt has been made to determine the 50% contour.










































































[Figure 1: efficiency contours plotted against the mean m (horizontal axis, logarithmic scale, about 0.04 to 400) and the exponent k (vertical axis, logarithmic scale).]

Fig. 1. Large-sample efficiencies of estimation of k.
Method 1 (solid line), Method 2 (broken line), Method 5 (chain line).


In considering the lower right-hand region of the figure, where m is large and k is small,
it is helpful to note* that if we set α = −k ln(1 − X) and let k → 0, X → 1, with α constant,
the expression (3.6) is asymptotically equal to 2(1 − e^(−α))/k, whence the limiting efficiencies
of Methods 2, 3 and 4 are all equal to α²e^α/(e^α − 1)². Other limiting forms of the efficiencies,
for movement away from the centre of the figure in various directions, are easily found and
need not be quoted.

No contours are shown in the figure for Methods 3 and 4, since, as it turns out, these are
nowhere more efficient than the more efficient of Methods 1 and 2, and the latter are more
convenient to use. Method 4 becomes equivalent to Method 2 as c → 0, and to Method 1
as c → 1. If we imagine c increasing continuously from 0 to 1, a constant-value contour of
E₄, such as the 90% contour, departs from the corresponding contour of E₂ by a translation
of the uppermost part of the curve to the right (‘east’) and a pulling of the middle part of
the curve downwards to the left (‘south-west’). (The first of these movements is easily
expressed. The contour has a vertical asymptote for k large, of the form m = constant. This
value of m is equal to the value for the corresponding contour of E₂ multiplied by (1 − c)⁻².)
As c increases, the contour eventually breaks up into two disjoint parts, and two new parallel


* Proofs of this and other statements in the remainder of this section are omitted, to save space. 















asymptotes appear, of the form m/k = constant. As c → 1, the upper part of the curve
approaches the E₁ contour, while the lower part recedes and vanishes in the limit. The E₃
contours are similar in character to E₄ contours.


4. TESTS FOR DEPARTURE FROM THE NEGATIVE BINOMIAL FORM OF DISTRIBUTION 


We have considered how a negative binomial distribution can be fitted to a large sample. 
We turn now to testing goodness of fit, in particular to detecting a departure from the 
negative binomial form towards one of the other forms discussed in § 2. Tests are required 
that are reasonably convenient to use. 

Particular interest attaches to discriminating between the negative binomial and the
Pólya-Aeppli forms of distribution. Let us suppose to begin with that m/k (= p) is small.
Then the log-likelihood function of the observations on the hypothesis of a negative binomial
distribution is

$$\begin{aligned}
L_1 = \sum_{r=0}^{\infty} n_r \ln P_r ={}& \sum_{r=0}^{\infty} n_r\, r \ln m - Nm - \sum_{r=2}^{\infty} n_r \ln(r!) + \sum_{r=0}^{\infty} n_r \left[\frac{r(r-1)}{m} - 2r + m\right] \frac{p}{2}\\
&- \sum_{r=0}^{\infty} n_r \left[\frac{r(r-1)(2r-1)}{2m^{2}} - 3r + 2m\right] \frac{p^{2}}{6} + O(p^{3}),
\end{aligned} \tag{4.1}$$

as p → 0 with m fixed. The maximum-likelihood equations for estimating k and p are
asymptotically those of Method 1 of the last section. Consider now the hypothesis that the
observations are drawn from a Pólya-Aeppli distribution. In terms of parameters k and p defined
as in § 2, the log-likelihood function of the observations, L₂ say, is the same as the above
expression for L₁ except that the term in p² is

$$-\sum_{r=0}^{\infty} n_r \left[\frac{r(r-1)(r-1-m)}{m^{2}} - r + m\right] \frac{p^{2}}{4}. \tag{4.2}$$


Again the maximum-likelihood method of estimating k and p is asymptotically equivalent
to fitting by the first two sample moments. The likelihood ratio criterion for discriminating
between the two distributions is found by maximizing L₁ and L₂ separately with respect
to k and p, and then subtracting them. Its leading term involves the first three sample
moments. We are thus led to propose the following test for departure from the negative
binomial form towards the Pólya-Aeppli form, to be used when p is small:

Test 1. Estimate the parameters of the negative binomial distribution from the first two 
sample moments (Method 1), and then compare the third sample moment with the estimated 
third moment of the distribution. 

There is no point in actually evaluating the likelihood ratio criterion just described, since 
we do not know a priori that the parent distribution is necessarily of one or other of the two 
forms considered. We may, however, apply a test similar to Test 1 to see whether or not the 
observations are in agreement with the Pólya-Aeppli or any other hypothetical form of
distribution. 

We can similarly investigate the likelihood ratio criterion in another simple limiting case,
namely, for P₀ → 1 with m constant. We find, for both distributions, that the
maximum-likelihood estimation of the parameters involves asymptotically only the two statistics r̄
and n₀/N (Method 2 above). The leading term of the likelihood ratio criterion involves, in
addition to these, a further statistic, Σ n_r ln r (summed over r from 2 to ∞). This is not a
convenient statistic on which to base a test, since its expected value cannot be expressed
simply. The only simple test that suggests itself is, in fact, one based on the sample variance, thus:

Test 2. Estimate the parameters of the negative binomial distribution from the sample
mean and the observed proportion of zeros (Method 2), and then compare the sample
variance with the estimated variance of the distribution.

Test 2 arises more directly in another problem of discrimination. We suppose that the 
parent distribution of the observations is a compound (or heterogeneous) negative binomial, 
i.e. that each observation is drawn from a negative binomial distribution having the same
exponent k but a mean that varies randomly from observation to observation in a distribution 
with mean m and variance ¢, say. The resulting distribution departs from the negative 
binomial form towards that of the discrete lognormal, having high skewness and kurtosis 
coefficients. We now test the hypothesis that ¢ = 0. If the log-likelihood function of the 
observations is expanded in ascending powers of ¢, the coefficient of ¢ is a function of the 
first two sample moments (and of the unknown parameters m and k). The optimum large- 
sample test of the hypothesis thus consists of fitting k and m by maximum likelihood, assuming 
that ¢ = 0, and then comparing the observed and estimated variances. In the region where 
Method 2 of fitting k is efficient, we would use Test 2 above. In the region where Method 1 
is efficient, i.e. for k large, a small heterogeneity in the mean of the negative binomial has 
the effect of apparently reducing k but not otherwise changing the shape of the distribution,
so that the heterogeneity is not easily detectable. Test 1 could be used to detect a pronounced 
degree of heterogeneity in the mean. 

It remains to consider how Tests 1 and 2 are carried out. We shall suppose the sample size
N to be large. The criterion of Test 1 is the difference between the third sample moment and
its estimated value, i.e. is

$$T = \frac{1}{N}\sum_{r=0}^{\infty} n_r (r - \bar r)^{3} - s^{2}\left(\frac{2s^{2}}{\bar r} - 1\right). \tag{4.3}$$

Using the known formulae (quoted by Kendall, 1943) for the variances and covariances of
k-statistics, or, alternatively, working in terms of the sample factorial moments (of which the
variances and covariances are easier to find directly), we obtain the large-sample result

$$N \operatorname{var}(T) \sim 2k(k+1)\,p^{3}(1+p)^{2}\,[2(3+5p) + 3k(1+p)]. \tag{4.4}$$

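The criterion of Test 2 is the difference between the sample variance and its estimated value,
i.e.

$$U = s^{2} - \bar r - \bar r^{2}/\hat k, \tag{4.5}$$

where k̂ is the estimate of k by Method 2. We find, for N large,

$$N \operatorname{cov}(\hat k, \bar r) \sim 0, \qquad N \operatorname{cov}(\hat k, s^{2}) \sim -\frac{k(k+1)\,p^{2}}{-\ln(1-X) - X}, \tag{4.6}$$

from which and (3.13) we obtain

$$N \operatorname{var}(U) \sim 2k(k+1)\,p^{2}(1+p)^{2}\left\{1 - \frac{X^{2}}{-\ln(1-X) - X}\right\} + p^{4}\,\frac{(1-X)^{-k} - 1 - kX}{\{-\ln(1-X) - X\}^{2}}. \tag{4.7}$$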








As an example, we may consider a frequency distribution quoted by Williams (1944, 
Table 6) of the number of head-lice of all stages found in the hair of Hindu male prisoners on 
admission to Cannanore jail, South India, over a period from 1937 to 1939 (see Buxton, 1940).* 


* Prof. Buxton has very kindly allowed me to check the frequency distribution against the original 
records. There seems to be one error in Williams’s table, the number of heads without lice being given as 
622 instead of 612. Williams has also made a numerical slip in fitting a negative binomial distribution 
to the observations, so that the fit appears worse than it should. 













We have N = 1073, n₀ = 612, r̄ = 6.93569, s² = 583.8. By Method 2, k̂ = 0.144198, and
hence U = 243.3, with estimated standard error 38.9. We conclude that the observations
are not adequately fitted by a negative binomial distribution, being too skew. The result is
not surprising, in view of the heterogeneity of the prisoners in caste and in other respects
likely to affect personal hygiene.


5. FITTING A COMMON EXPONENT TO A SERIES OF SAMPLES 


We suppose now that we have a series of samples, one from each of v negative binomial
distributions. Characteristics of the ith distribution and the sample from it will be denoted
by the usual symbol with suffix i added. We may be interested in investigating how k_i is
related to m_i. If the k_i are not too small we may estimate them by Method 1 and plot them
against r̄_i. We shall obtain a less skew distribution of errors if the reciprocal of the estimate
of k_i, rather than the estimate itself, is considered, i.e. (s_i² − r̄_i)/r̄_i². In either case, there is
a bias of order N_i⁻¹ which may be worth removing if v is large. The following estimate of k_i⁻¹,




(5-1) 


is approximately unbiased, having expected value k_i⁻¹ + O(N_i⁻²). This is easily proved
(dropping now the suffix i, for convenience) by writing r̄ = kp + δm, s² = kp(1 + p) + δσ²,
and expanding (5.1) in ascending powers of δm and δσ². To the first order when N is large,
the variance of this estimate of k⁻¹ is equal to the right-hand side of (3.11) divided by k⁴, and
its correlation with r̄ is zero, in samples from the same distribution.

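Sometimes it is reasonable to suppose that all the $k_i$ are equal. We then desire an efficient
estimate of the common value. A method that suggests itself is to choose a value $\hat k$ such
that the sum (or a weighted sum) of $s_i^2 - \bar m_i - \bar m_i^2/\hat k$ vanishes. The expected value of this
expression, when $\hat k$ is replaced by the true $k$, is $O(N_i^{-1})$, and it can be reduced to $O(N_i^{-2})$ if we use
instead
$$t_i(\hat k) = s_i^2 - (\bar m_i + \bar m_i^2/\hat k)\left(1 - \frac{1}{N_i\hat k}\right). \tag{5.2}$$
If weights $w_i$ are used, our equation for $\hat k$ becomes
$$\sum_{i=1}^{v} w_i\,t_i(\hat k) = 0. \tag{5.3}$$
Treating $\hat k - k$ as infinitesimal and working to the first order for $N_i$ large, we find easily
$$\operatorname{var}(\hat k) \sim \frac{\sum w_i^2 \operatorname{var}(t_i)}{\left(\sum w_i\,\partial t_i/\partial k\right)^2}, \tag{5.4}$$
$$N_i \operatorname{var}(t_i) \sim 2k(k+1)\,p_i^2(1+p_i)^2, \tag{5.5}$$
$$\partial t_i/\partial k \sim p_i^2. \tag{5.6}$$
$\operatorname{var}(\hat k)$ is a minimum when the weights $w_i$ are taken to be proportional to
$$\frac{1}{\operatorname{var}(t_i)}\frac{\partial t_i}{\partial k} \propto \frac{N_i}{(m_i+k)^2}.$$
We therefore choose
$$w_i = \frac{N_i-1}{(\bar m_i+\hat k)^2}. \tag{5.7}$$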












F. J. ANSCOMBE 375 
The numerator $N_i-1$ is more convenient than $N_i$, since $(N_i-1)s_i^2$ is found in the course of
calculating $s_i^2$. Thus the method is to choose $\hat k$ to satisfy $\sum_i T_i(\hat k) = 0$, where
$$T_i(\hat k) = \frac{(N_i-1)\left\{s_i^2 - (\bar m_i + \bar m_i^2/\hat k)\left(1 - 1/(N_i\hat k)\right)\right\}}{(\bar m_i+\hat k)^2}. \tag{5.8}$$
It is easy to verify that $T_i(k)$ has expected value $O(N_i^{-1})$. The method is equivalent to taking
an appropriately weighted average of the estimates (5.1); and, to the first order for $N_i$ large,
the variance of $\hat k$ is the reciprocal of the sum of reciprocals of the variances (3.11) for each
sample. In samples from a single population, the correlation between $T(k)$ and $\bar m$ is $O(N^{-1})$.
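The fitting of a common exponent can be sketched numerically. The sketch assumes the bias-corrected quantity $t_i(k) = s_i^2 - (\bar m_i + \bar m_i^2/k)(1 - 1/(N_i k))$ and the weights $w_i = (N_i-1)/(\bar m_i+k)^2$; the three samples are hypothetical, constructed so that every $t_i$ vanishes at $k = 2$, and the bracket for the bisection is an arbitrary choice.

```python
def t_stat(mean, s2, n, k):
    """Assumed form of t_i(k): variance minus the bias-corrected fitted variance."""
    return s2 - (mean + mean * mean / k) * (1.0 - 1.0 / (n * k))

def weighted_sum(samples, k):
    """Sum of w_i * t_i(k) with w_i = (N_i - 1) / (mean_i + k)**2."""
    return sum((n - 1) / (mean + k) ** 2 * t_stat(mean, s2, n, k)
               for mean, s2, n in samples)

def fit_common_k(samples, lo=0.05, hi=200.0, iters=200):
    """Bisection for the k at which the weighted sum changes sign."""
    f_lo = weighted_sum(samples, lo)
    f_hi = weighted_sum(samples, hi)
    if f_lo * f_hi > 0:
        raise ValueError("root not bracketed")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = weighted_sum(samples, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Hypothetical samples (mean, variance, size) built to be consistent with k = 2
k_true = 2.0
samples = []
for mean, n in [(1.0, 50), (3.0, 40), (5.0, 60)]:
    s2 = (mean + mean * mean / k_true) * (1.0 - 1.0 / (n * k_true))
    samples.append((mean, s2, n))

k_hat = fit_common_k(samples)
print(k_hat)
```

With real data the weighted sum need not be monotone over so wide a bracket, so the end-points would have to be chosen with some care.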
We can deal similarly with estimates of $k$ based on Method 2. Corresponding to (5.2) we
may consider
$$u_i(\hat k) = n_{0i} - \left(1+\frac{\bar m_i}{\hat k}\right)^{-\hat k}\left(N_i - \frac{(\hat k+1)\,\bar m_i}{2(\bar m_i+\hat k)}\right), \tag{5.9}$$
the expected value of which is $O(N_i^{-1})$. The optimum weight factor $w_i$ (for $N_i$ large) is
$$w_i = \frac{-\ln(1-X_i)-X_i}{(1-X_i)^{k}\left\{(1-X_i)^{-k}-1-kX_i\right\}}, \tag{5.10}$$
or rather an estimate of this. Since there is little to be gained by exactitude in weight factors,
we may prefer to take instead
$$w_i = \ln(1+\bar m_i/\hat k), \tag{5.11}$$
which has roughly the same effect and is easier to calculate. If (5.10) is used, we find that,
to the first order for $N_i$ large, the variance of $\hat k$ is the reciprocal of the sum of reciprocals of
the variances (3.13) for each sample. To test whether $k$ changes progressively with $m$, we
may plot $U_i = w_iu_i$ against $\bar m_i$ and look for a correlation. In samples from a single population,
the correlation between $U$ and $\bar m$ is $O(N^{-1})$.*


6. FITTING THE LOGARITHMIC SERIES DISTRIBUTION 


Fisher’s logarithmic series distribution was proposed as a model for the relative abundance 
of different species found in trap catches or other methods of sampling; in particular, to 
describe the relation between the numbers of moths of different species caught in a light-trap 
over a period of time. Suppose there are N species that might be observed, and that their 
abundances (numbers expected to be caught in a unit period of time) are distributed as if 
they were a sample of size $N$ from a Type III distribution, proportional to a $\chi^2$ with $2k$
degrees of freedom. We suppose further that the individuals of each species move indepen- 
dently, so that the number of individuals of any species caught has a Poisson distribution 
with mean value equal to the abundance multiplied by the length of time of observation. 
The numbers of individuals caught per species then form a random sample of size N from the 
negative binomial distribution (1-1). The results of the observation can be expressed by the 
numbers $n_i$ of species represented by $i$ individuals for all $i>0$. We define $S$ and $I$ as in §1,
so that $S$ is the number of species represented by at least one individual, and $I$ is the total


* In my 1949 paper there is an oversight on this point. Having noticed a correlation between $U_i$ and
$\bar m_i$ in some data, I added (p. 172): ‘The effect is too marked to be attributed to the negative correlation
between $n_0$ and $\bar m$ that occurs in repeated sampling of the same population.’ This is literally true, but
misleading, since the relevant correlation is that between $U$ and $\bar m$, not $n_0$ and $\bar m$, and that is $O(N^{-1})$.













number of individuals of all species observed. The probability distribution of $n_i$ for $i \ge 1$,
given $N$, $k$ and $X$, is
$$\frac{N!}{(N-S)!\,\prod_{i\ge1} n_i!}\,(1-X)^{Nk}\,X^{I}\prod_{i\ge1}\left\{\frac{\Gamma(k+i)}{i!\,\Gamma(k)}\right\}^{n_i}. \tag{6.1}$$





If we set $Nk = \alpha$, and consider the limit $k\to0$, $N\to\infty$, with $\alpha$ constant, we find easily that
the above breaks up into a product of Poisson frequency functions for $n_i$ ($i \ge 1$), as indicated
at (1.8). If the time of exposure or attractive power of the trap were multiplied by a factor $c$,
without the abundances of the species being affected, it follows from the above assumptions
that $p$ would be changed to $cp$, and therefore $X$ to $cp/(cp+1)$, while $\alpha$ would be unaltered.
$\alpha$ is thus a property of the biological association that is being examined, and has been termed
by C. B. Williams the ‘index of diversity’ of the association.
The log-likelihood function of the observations is
$$L = \alpha\ln(1-X)+S\ln\alpha+I\ln X-\sum_{r=1}^{\infty}\{n_r\ln r+\ln n_r!\}. \tag{6.2}$$


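Thus $S$ and $I$ are jointly sufficient for estimating $\alpha$ and $X$, and the maximum-likelihood
equations are
$$I = \hat\alpha\hat X/(1-\hat X) = \hat\alpha\hat p, \qquad S = -\hat\alpha\ln(1-\hat X) = \hat\alpha\ln(1+\hat p). \tag{6.3}$$
On inverting the matrix of expectations of second derivatives of $L$, to find the variances of
these estimates in the usual way, we get
$$[-\ln(1-X)-X]\operatorname{var}(\hat X) \sim -\alpha^{-1}X(1-X)^2\ln(1-X),$$
$$[-\ln(1-X)-X]\operatorname{cov}(\hat X,\hat\alpha) \sim -X(1-X), \tag{6.4}$$
$$[-\ln(1-X)-X]\operatorname{var}(\hat\alpha) \sim \alpha.$$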
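The inversion leading to (6.4) can be checked numerically. The sketch below assumes the expected second derivatives obtained by differentiating the log-likelihood $\alpha\ln(1-X)+S\ln\alpha+I\ln X$ twice and inserting $E(S) = -\alpha\ln(1-X)$, $E(I) = \alpha X/(1-X)$; these intermediate expressions are our own working, not quoted from the text, and the parameter values are arbitrary.

```python
import math

def mle_covariance(alpha, X):
    """Invert the 2x2 expected information matrix for (alpha, X)."""
    L = -math.log1p(-X)                      # -ln(1 - X)
    j_aa = L / alpha                         # -E d2L/d alpha2
    j_ax = 1.0 / (1.0 - X)                   # -E d2L/d alpha dX
    j_xx = alpha / (X * (1.0 - X) ** 2)      # -E d2L/dX2
    det = j_aa * j_xx - j_ax ** 2
    return {"var_alpha": j_xx / det, "cov": -j_ax / det, "var_X": j_aa / det}

alpha, p = 10.0, 50.0                        # arbitrary illustrative values
X = p / (1.0 + p)
L = -math.log1p(-X)
c = L - X                                    # the common factor -ln(1-X) - X
cov = mle_covariance(alpha, X)

print(c * cov["var_alpha"], alpha)
print(c * cov["cov"], -X * (1.0 - X))
print(c * cov["var_X"], X * (1.0 - X) ** 2 * L / alpha)
```

Each printed pair should agree, the left member coming from the matrix inversion and the right from the corresponding line of (6.4).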


These formulae would certainly be correct asymptotically if the right-hand sides were 
divided by v and we were considering pooled estimates from v completely independent 
samples, with v tending to infinity. But we are actually concerned with one sample. Let us 
see in what sense, if any, (6.4) can still be regarded as correct. If $\alpha$ and $p$ are both large,
differentiation of any derivative of $L$ by $p$ changes its order of magnitude by a factor $1/p$,
and differentiation by $\alpha$ changes its order by a factor $1/\alpha$. The second derivatives are
effectively constant if in probability


where $\delta\alpha = \hat\alpha-\alpha$, $\delta p = \hat p-p$. Assuming (6.4) to be true,


woes) 08) 


in probability. Hence formulae (6.4) are correct asymptotically if $\alpha\to\infty$ while $p$ is constant
or increases (or, more generally, has a positive lower bound).

But this result is not entirely satisfactory, since $\alpha$ is a constant of the association being
observed, and although in many of the examples cited by Fisher et al. (1943) and Williams




(1944) $\alpha$ is fairly large, there is no logical necessity for it to be so.* The only adjustable
feature of the sample is the time of exposure (or the attractive power) of the trap, by
increasing which $p$ may be increased. It is therefore of interest to develop asymptotic
formulae valid as $p\to\infty$ with $\alpha$ constant. To the first order when $p$ is large, the variance
of $\hat\alpha$ given at (6.4) is
$$\operatorname{var}(\hat\alpha) \sim \frac{\alpha}{\ln p}, \tag{6.5}$$
while to the next order of approximation
$$\operatorname{var}(\hat\alpha) \sim \frac{\alpha}{\ln p-1}. \tag{6.6}$$
Let us see whether in fact one or both of these is correct, as $p\to\infty$.
From (6.3), $\hat\alpha$ is determined by the equation
$$S = \hat\alpha\{\ln(I+\hat\alpha)-\ln\hat\alpha\}. \tag{6.7}$$


The distribution of $S$ is Poisson, with mean $\alpha\ln(1+p)$; the distribution of $I$ is negative
binomial with mean $\alpha p$ and exponent $\alpha$. Thus while the former approaches normality as
$p\to\infty$, and has coefficient of variation tending to zero, the latter does not. The distribution
of $\ln(I+\alpha)$ also does not approach normality as $p\to\infty$, but the asymptotic distribution is
known (Anscombe, 1948), and we have
$$E\{\ln(I+\alpha)\} = \ln p+\psi(\alpha)+O(p^{-\beta}), \qquad \operatorname{var}\{\ln(I+\alpha)\} = \psi'(\alpha)+O(p^{-\beta}), \tag{6.8}$$
where $\beta$ is any quantity such that $0<\beta<1$, $\beta<\alpha$. We write now $\delta S$, $\delta I$, $\delta\ln(I+\alpha)$ for the
differences between $S$, $I$, $\ln(I+\alpha)$, and their respective mean values. Then (6.7) can easily
be shown to give, in probability as $p\to\infty$,
$$\delta\alpha = \frac{\delta S+\alpha\{\ln\alpha-\psi(\alpha)-\delta\ln(I+\alpha)\}}{\ln p-\{1+\ln\alpha-\psi(\alpha)-\delta\ln(I+\alpha)\}+o(1)}. \tag{6.9}$$
Considering the first terms of numerator and denominator, we see at once that the distribution
of $\hat\alpha$ is asymptotically normal and that (6.5) is correct. To see whether (6.6) is also correct,
we need to evaluate $E\{\delta S\,\delta\ln(I+\alpha)\}$ and $E\{(\delta S)^2\,\delta\ln(I+\alpha)\}$. The joint distribution of $S$
and $I$ has probability-generating function
$$E(t^Su^I) = \exp\{\alpha(t-1)\ln(1+p)-\alpha t\ln(1+p-pu)\}. \tag{6.10}$$
Writing $x = \{S-\alpha\ln(1+p)\}\{\alpha\ln(1+p)\}^{-\frac12}$ and $y = Ip^{-1}$,


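we find, on expanding the characteristic function of $x$ and $y$ for $p$ large and applying the
Fourier inversion formula, the asymptotic continuous distribution of $x$ and $y$:
$$\frac{1}{\sqrt{2\pi}}\,e^{-\frac12x^2}\,\frac{y^{\alpha-1}e^{-y}}{\Gamma(\alpha)}\left[1+\frac{1}{\sqrt{\alpha\ln p}}\left\{\tfrac16(x^3-3x)+\alpha x(\ln y-\psi(\alpha))\right\}+O\!\left(\frac{1}{\ln p}\right)\right]dx\,dy. \tag{6.11}$$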


* The fact that $\alpha$ is not adjustable does not in itself bar the use of an asymptotic formula valid as
$\alpha\to\infty$, since in any case such asymptotic formulae are used as approximations. Even if the parameter
concerned is adjustable, only one value is usually available for consideration, and not an infinite sequence 
of values. An example of confusion on this point is the criticism by Kendall (1948) of a limit situation 
considered by Jones (1948) in the theory of systematic sampling. Jones gives a formula for the error 
variance which is asymptotically correct as the population extent tends to infinity, with constant spacing 
between sample points. Kendall, remarking that the population extent is not in general adjustable, gives 
a formula asymptotically correct as the spacing between sample points tends to zero, with the population 
extent constant. As an approximation to the actual situation, Jones’s formula is the better (see Kendall, 
1948, equations (20) and (25)). 



















Hence easily
$$E\{\delta S\,\delta\ln(I+\alpha)\} = \alpha\psi'(\alpha)+o(1), \qquad E\{(\delta S)^2\,\delta\ln(I+\alpha)\} = O(1). \tag{6.12}$$

Finally, from (6-9), E (a) = a) guts we t aad ; (6:13) 
$$\operatorname{var}(\hat\alpha) = \frac{\alpha}{\ln p+\alpha\psi'(\alpha)-2\{1+\ln\alpha-\psi(\alpha)\}+o(1)}. \tag{6.14}$$


On inserting the asymptotic expansions of $\psi(\alpha)$ and $\psi'(\alpha)$ in powers of $\alpha^{-1}$, we obtain the
right-hand side of (6.6) with remainder term in the denominator which is $o(1)$ for both $p$
and $\alpha$ large. Thus (6.6) is not correct, to the order suggested, if $p$ is large, unless $\alpha$ is also large.
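The size of the discrepancy can be seen by evaluating the constant that accompanies $\ln p$ in the denominator of (6.14), read here as $\alpha\psi'(\alpha)-2\{1+\ln\alpha-\psi(\alpha)\}$, and comparing it with the $-1$ of (6.6). The sketch is self-contained: the digamma and trigamma functions are computed from the recurrence relations and the standard asymptotic series, and the function names are our own.

```python
import math

def digamma(x):
    """psi(x) for x > 0: shift x upward by recurrence, then use the asymptotic series."""
    acc = 0.0
    while x < 8.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))

def trigamma(x):
    """psi'(x) for x > 0, by the same device."""
    acc = 0.0
    while x < 8.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * inv2 + (inv2 / x) * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))

def offset(alpha):
    """Constant added to ln p in the denominator of (6.14), as read above."""
    return alpha * trigamma(alpha) - 2.0 * (1.0 + math.log(alpha) - digamma(alpha))

for a in (0.5, 1.0, 5.0, 20.0, 100.0):
    print(a, round(offset(a), 4))            # tends to -1 as alpha grows
```

For $\alpha = 1$ the constant is about $-1.51$ rather than $-1$, while for $\alpha = 100$ the two agree to within about 0.005.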

The formulae for $\operatorname{var}(\hat\alpha)$ that we have just discussed differ from one another only in
accuracy; they are all approximations to the same true value, which has not been found.
An essentially different formula has been given by Fisher (Fisher et al. 1943). When expressed
in a form similar to (6.6), it is
$$\operatorname{var}(\hat\alpha) \sim \frac{\alpha\ln2}{(\ln p-1)^2}. \tag{6.15}$$
This formula is appropriate to a special type of comparison, namely, between estimates of
$\alpha$ for the same biological association derived from similar nearby traps, where it may be
assumed that the individual species have exactly the same abundances (or at least the same
relative abundances), and the difference between the catches at any two traps arises solely
from Poisson variation in the numbers caught of each species. In such a sampling process,
the estimate of $\alpha$ given by (6.7) is substantially biased, since only one set of relative
abundances is involved. If we consider the variation in the bias for all possible sets of relative
abundances following the limiting Type III distribution assumed, the overall variance of $\hat\alpha$
is increased from Fisher’s value to that already considered. The larger variance is appropriate
to comparing the values of $\hat\alpha$ from observations on different sorts of biological association,
involving perhaps entirely different families of species, and also to comparing values of $\hat\alpha$
from observations on the same sort of biological association observed at different seasons of
the year or in different years, when, even if the families of species are the same, the relative
abundances of the species are different and may be supposed in aggregate to constitute
independent samples from the limiting Type III distribution. Fisher’s formula, in fact, is
not likely to be often useful, since if we desire to test whether the abundances of the
individual species are the same (or in proportion) at a number of traps it will be correct in the
first place to make direct comparisons of counts of individual species, by a $\chi^2$ contingency-
table test, or by analysis of variance after making a square-root transformation. If it has
been established that the relative abundances of the species are not the same at different
traps, it is likely that they will differ sufficiently to appear in aggregate to be independent
samples from the hypothetical parent Type III distribution. They may indeed be so different
as to suggest quite different parent distributions, and a test of this point would be based on
the total variance of $\hat\alpha$.

For example, Williams gives figures for captures of Noctuidae during a period of three
months in 1933 at two traps (Fisher et al. 1943). One trap, on a roof-top, gave $S = 58$,
$I = 1856$, $\hat\alpha = 11.37$, $\hat p = 163$; the other, in a field a quarter of a mile away, gave $S = 40$,
$I = 929$, $\hat\alpha = 8.51$, $\hat p = 109$. Fisher’s formula (6.15) for the standard error of either estimate
of $\alpha$ gives approximately 0.67, indicating a significant difference between them. Whether or
not the relative abundances of the species differed at the two traps would be more efficiently







tested by comparing counts of individual species. Formula (6.6) gives for the standard error
of either estimate of $\alpha$ approximately 1.59, against which the observed difference is not
significant. A significant difference in the richness of the associations observed at the two
traps might be demonstrated, perhaps, by showing that the forty species caught in the field
trap had the same relative abundances (within the limits of Poisson variation in numbers
caught) as in the roof trap, while the remaining species caught in the roof trap were
significantly more abundant, relatively to the others, than in the field trap (where the catches
were zero). Apart from some such argument based on comparing catches of individual species,
we cannot conclude from the figures for $S$ and $I$ alone that the biological associations observed
at the two traps differed in diversity index $\alpha$.
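The figures of this example can be retraced as follows. The sketch solves $S = \hat\alpha\ln(1+I/\hat\alpha)$ by bisection and then evaluates two standard errors: one from $\alpha/(\ln p-1)$, and one from Fisher's formula, which is assumed here to read $\alpha\ln2/(\ln p-1)^2$ in its large-$p$ form. Function and variable names are illustrative.

```python
import math

def fit_alpha(S, I, lo=1e-9, hi=1e6, iters=200):
    """Solve S = alpha * ln(1 + I/alpha) for alpha by bisection."""
    def f(a):
        return a * math.log1p(I / a) - S     # increasing in a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def se_total(alpha, p):
    """Standard error from var ~ alpha / (ln p - 1)."""
    return math.sqrt(alpha / (math.log(p) - 1.0))

def se_fisher(alpha, p):
    """Standard error from the assumed form var ~ alpha ln 2 / (ln p - 1)^2."""
    return math.sqrt(alpha * math.log(2.0)) / (math.log(p) - 1.0)

traps = {"roof": (58, 1856), "field": (40, 929)}   # (S, I) as quoted in the text
results = {}
for name, (S, I) in traps.items():
    a = fit_alpha(S, I)
    p = I / a
    results[name] = (a, p, se_total(a, p), se_fisher(a, p))
    print(name, [round(v, 2) for v in results[name]])

mean_total = 0.5 * (results["roof"][2] + results["field"][2])
mean_fisher = 0.5 * (results["roof"][3] + results["field"][3])
```

Averaged over the two traps, the two standard errors come out close to the quoted 1.59 and 0.67, so that the observed difference of about 2.9 between the estimates is large relative to the second but not to the first.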


7. TESTS FOR DEPARTURE FROM THE LOGARITHMIC SERIES FORM OF DISTRIBUTION


Numerous alternatives to the logarithmic series distribution suggest themselves. Fisher has 
considered the negative binomial form (6-1) with k>0 and X, k and N unknown. We can 
obtain other three-parameter distributions by replacing the Type III distribution of species 
abundances by any other distribution of a non-negative random variable. The situation is 
the same as for the heterogeneous Poisson sampling considered in § 2, except that $n_0$ is not
observed and N is an unknown parameter requiring estimation. 

Fisher gives a test for departure from the logarithmic form of distribution towards that
at (6.1) with $k > 0$. The appropriate statistic, in addition to $S$ and $I$, is
$$J = \sum_{r=2}^{\infty} n_r\left(1+\frac12+\frac13+\ldots+\frac{1}{r-1}\right), \tag{7.1}$$
of which the expected value is $\tfrac12\alpha[\ln(1+p)]^2$. Thus $J-S^2/(2\hat\alpha)$ may be taken as a test criterion,
and its sampling variance can be investigated by the methods already indicated. If we are
content with a first-order asymptotic result when $\alpha\to\infty$, we may consider the matrix of
expectations of second derivatives of the logarithm of the likelihood function (6.1), the
differentiations being with respect to $X$, $\alpha$, and $k$. Setting $k = 0$, we obtain


$$\operatorname{var}\left(J-\frac{S^2}{2\hat\alpha}\right) \sim \alpha\left[\phi(X)+\frac{\{-\ln(1-X)\}^3\{-\ln(1-X)-4X\}}{12\{-\ln(1-X)-X\}}\right], \tag{7.2}$$
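where
$$\phi(X) = \sum_{r=2}^{\infty}\left(1+\frac14+\frac19+\ldots+\frac{1}{(r-1)^2}\right)\frac{X^r}{r}. \tag{7.3}$$
When $p$ is large
$$\phi(X) \sim A\ln p-B, \tag{7.4}$$
where, in terms of the Riemann $\zeta$-function,
$$A = \zeta(2) = 1.6449, \qquad B = 2\zeta(3) = 2.4041; \tag{7.5}$$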

and hence, ignoring a factor $1+O(p^{-1})$, as in (6.6),
$$\operatorname{var}\left(J-\frac{S^2}{2\hat\alpha}\right) \sim \frac{\alpha\left[\tfrac{1}{12}(\ln p)^3(\ln p-4)+(A\ln p-B)(\ln p-1)\right]}{\ln p-1}. \tag{7.6}$$


Fisher denotes the right-hand side of (7-2) by ¢ and has-tabulated it. 
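Two of the statements above lend themselves to a direct numerical check: the expected value $\tfrac12\alpha[\ln(1+p)]^2$ of $J$, and the large-$p$ behaviour $A\ln p-B$ of $\phi(X)$. The sketch assumes that $E(n_r) = \alpha X^r/r$ and that $\phi(X)$ is the series $\sum_{r\ge2}\{1+\tfrac14+\ldots+1/(r-1)^2\}X^r/r$; the parameter values and the truncation point of the series are arbitrary.

```python
import math

def expected_J(alpha, X, n_terms):
    """Sum of E(n_r) * (1 + 1/2 + ... + 1/(r-1)) with E(n_r) = alpha * X**r / r."""
    total, harmonic, x_pow = 0.0, 0.0, X
    for r in range(2, n_terms + 1):
        harmonic += 1.0 / (r - 1)
        x_pow *= X                           # now X**r
        total += alpha * x_pow / r * harmonic
    return total

def phi(X, n_terms):
    """Assumed series for phi(X): partial sums of 1/j**2 weighted by X**r / r."""
    total, h2, x_pow = 0.0, 0.0, X
    for r in range(2, n_terms + 1):
        h2 += 1.0 / (r - 1) ** 2
        x_pow *= X
        total += h2 * x_pow / r
    return total

alpha, p = 10.0, 1.0e4                       # arbitrary illustrative values
X = p / (1.0 + p)
n_terms = 400_000                            # X**n_terms is negligible here

A = math.pi ** 2 / 6                         # zeta(2)
B = 2.0 * 1.2020569031595942                 # 2 * zeta(3)

ej = expected_J(alpha, X, n_terms)
ph = phi(X, n_terms)
print(ej, 0.5 * alpha * math.log1p(p) ** 2)
print(ph, A * math.log(p) - B)
```

The first pair agrees to rounding error; the second differs only by a term that vanishes as $p$ increases.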

Applications of the distribution made by Williams suggest that another sort of departure 
from the logarithmic series form may be worth investigating. In a complex association it 
might happen that while certain components of the association exhibit logarithmic series 










distributions the whole does not, since the component distributions have different
parameters. Let $\alpha_i$, $X_i$, $p_i$ relate to the $i$th component association ($i = 1, 2, \ldots, v$), and consider
a logarithmic series distribution with parameters $\alpha_0$, $X_0$, $p_0$, chosen to give the same expected
numbers of species and individuals as in the whole association. Then
$$\alpha_0\ln(1+p_0) = \sum_i\alpha_i\ln(1+p_i), \qquad \alpha_0p_0 = \sum_i\alpha_ip_i. \tag{7.7}$$
Hence
$$\frac{\ln(1+p_0)}{p_0} = \sum_{i=1}^{v}w_i\,\frac{\ln(1+p_i)}{p_i}, \tag{7.8}$$
where $w_i = \alpha_ip_i\big/\sum_{j=1}^{v}\alpha_jp_j$,
so that $\ln(1+p_0)/p_0$ is a weighted mean of $\ln(1+p_i)/p_i$. If $E$ and $E^*$ denote respectively
expectations for the actual distribution and for the fitted logarithmic series distribution,
we have
$$E(n_r) = \sum_{i=1}^{v}\alpha_ip_i\,\frac{p_i^{r-1}}{r(1+p_i)^r}, \qquad E^*(n_r) = \left(\sum_{i=1}^{v}\alpha_ip_i\right)\frac{p_0^{r-1}}{r(1+p_0)^r}. \tag{7.9}$$
We therefore consider $Y = p^{r-1}(1+p)^{-r}$ as a function of $Z = p^{-1}\ln(1+p)$. When $r = 1$, we
find $d^2Y/dZ^2 > 0$ for all $p$, so that $Y$ is a convex function of $Z$, and therefore $E^*(n_1) < E(n_1)$,
with equality only if the $p_i$ are all equal. For $r \ge 2$, the sign of $d^2Y/dZ^2$ depends on $p$ and can
be studied in detail for each $r$. It is not difficult to show that $E^*(n_r) < E(n_r)$ for a range of
small values of $r$ and also for large values of $r$, while the inequality is reversed in a middle
range (assuming the $p_i$ not all equal). If all the $p_i$ are large and not very unequal, the values
of $r$ where the inequality changes are roughly
$$\frac{p_0}{2\ln p_0} \quad\text{and}\quad 2p_0.$$
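The pattern of inequalities can be illustrated with a mixture of two hypothetical component associations. The sketch fits a single logarithmic series having the same expected numbers of species and individuals, and compares expected frequencies taken as $\sum_i\alpha_iX_i^r/r$ for the mixture and $\alpha_0X_0^r/r$ for the fitted series, with $X = p/(1+p)$. The component parameters are invented for illustration and are rather more unequal than the text contemplates.

```python
import math

def fitted_single_series(alphas, ps, iters=200):
    """Find (alpha0, p0) matching the expected species and individuals of the mixture."""
    individuals = sum(a * p for a, p in zip(alphas, ps))
    species = sum(a * math.log1p(p) for a, p in zip(alphas, ps))
    target = species / individuals           # equals ln(1 + p0) / p0
    lo, hi = 1e-9, 1e9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.log1p(mid) / mid > target:   # ln(1+p)/p decreases with p
            lo = mid
        else:
            hi = mid
    p0 = 0.5 * (lo + hi)
    return individuals / p0, p0

def expected_actual(alphas, ps, r):
    """Expected n_r for the mixture of components."""
    return sum(a * (p / (1.0 + p)) ** r for a, p in zip(alphas, ps)) / r

def expected_fitted(alpha0, p0, r):
    """Expected n_r for the single fitted logarithmic series."""
    return alpha0 * (p0 / (1.0 + p0)) ** r / r

alphas, ps = [5.0, 5.0], [20.0, 80.0]        # hypothetical components
alpha0, p0 = fitted_single_series(alphas, ps)
print(round(alpha0, 2), round(p0, 1), round(p0 / (2 * math.log(p0)), 1), round(2 * p0, 1))
for r in (1, 5, 10, 30, 90, 400):
    print(r, expected_fitted(alpha0, p0, r) < expected_actual(alphas, ps, r))
```

In this instance the fitted expectation falls below the actual one for small and for large $r$ and exceeds it in between, with the changes of sign occurring near the two approximate values quoted.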


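The appropriate statistic for detecting a small degree of inequality in the $p_i$ is easily seen
from the likelihood function to be $\sum_{r=2}^{\infty}n_rr(r-1)$, of which the expected value is $\alpha p^2$ when there
is no heterogeneity, i.e. when $p_i = p$ for all $i$ and $\sum_{i=1}^{v}\alpha_i = \alpha$. Hence we may take
$$W = \frac{\hat\alpha\sum_{r=2}^{\infty}n_rr(r-1)}{I^2} \tag{7.10}$$
as test criterion. It is easy to show that, asymptotically for large $\alpha$,
$$\operatorname{var}(W) \sim \frac{1}{\alpha}\left[\frac{2}{X^2}-\frac{1}{-\ln(1-X)-X}\right], \tag{7.11}$$
or, when $p$ is large also,
$$\operatorname{var}(W) \sim \frac{1}{\alpha}\left[2-\frac{1}{\ln p-1}\right]. \tag{7.12}$$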


The test is a limiting form of Test 2 of § 4. 




Several distributions of classification of species into genera given by Williams (1944) seem
to show this kind of heterogeneity. Thus for Coccidae of the world classified by MacGillivray
(Williams’s Table 11), $S = 352$, $I = 1763$, $\hat\alpha = 132.2$, $\hat p = 13.34$, $W = 2.76$, s.e.$(W) = 0.11$.
The observed number of monotypic genera $n_1$ is 181, which is much above its expected value,
123.0, on the basis of the logarithmic series distribution. From $r = 2$ to $r = 25$, roughly,
the $n_r$ are on the whole less than expected, while for higher $r$ they appear again to be greater
than expected.
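The quoted standard error and expected value can be retraced from $\hat\alpha$ and $\hat p$ alone. The sketch assumes that the large-$\alpha$ variance of $W$ is $\alpha^{-1}[2/X^2-1/\{-\ln(1-X)-X\}]$, a form obtained here by the usual first-order argument and consistent with the quoted s.e.; the expected number of monotypic genera is taken as $\hat\alpha\hat X$.

```python
import math

def se_W(alpha, p):
    """Standard error of W under the assumed large-alpha variance formula."""
    X = p / (1.0 + p)
    c = -math.log1p(-X) - X
    return math.sqrt((2.0 / X ** 2 - 1.0 / c) / alpha)

def se_W_large_p(alpha, p):
    """Cruder version valid when p is large as well."""
    return math.sqrt((2.0 - 1.0 / (math.log(p) - 1.0)) / alpha)

alpha_hat, p_hat = 132.2, 13.34              # Coccidae figures quoted in the text
X_hat = p_hat / (1.0 + p_hat)
expected_n1 = alpha_hat * X_hat              # expected number of monotypic genera

print(round(se_W(alpha_hat, p_hat), 3), round(se_W_large_p(alpha_hat, p_hat), 3))
print(round(expected_n1, 1))
```

With $W = 2.76$ against an expectation near unity, the departure is many times its standard error on either version of the formula.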

$W$ is closely related to the characteristic $K$ of Yule (1944). Thus
$$K = 10{,}000\,\frac{\sum_r n_rr(r-1)}{I^2} = 10{,}000\,\frac{W}{\hat\alpha}. \tag{7.13}$$
It is easy to show that, asymptotically for large $\alpha$,
$$E(K) = 10{,}000\,\alpha^{-1}, \qquad \operatorname{var}(K) = 2\times10^8\,[\alpha^3X^2]^{-1}, \tag{7.14}$$
if the observations are drawn from a logarithmic series distribution. Simpson (1949) has
shown that, asymptotically for large $p$, but with $\alpha$ not assumed large,
$$E(K) = 10{,}000\,(\alpha+1)^{-1}. \tag{7.15}$$
As a statistic for estimating $\alpha$, $K$ is of low efficiency. Its proper function is to test distribution
shape.


I owe my interest in this subject to stimulating conversations with Dr C. B. Williams. 
Several persons have offered helpful suggestions and criticisms, in particular Mr D.G. Kendall, 
Mr J. G. Skellam and my colleagues at Cambridge, Dr J. Wishart, Dr H. E. Daniels and 
Mr D. V. Lindley. The work was begun at Rothamsted Experimental Station, and some 
help with the computations on which the figure is based was given by Mr B. M. Church. 


REFERENCES 


Aitken, A. C. (1939). Statistical Mathematics. Edinburgh: Oliver and Boyd.
Anscombe, F. J. (1948). Biometrika, 35, 246.
Anscombe, F. J. (1949). Biometrics, 5, 165.
Anscombe, F. J. (1950). Ann. Appl. Biol. 37, 286.
Beall, G. (1940). Ecology, 21, 460.
Buxton, P. A. (1940). Parasitology, 32, 296.
Cernuschi, F. & Castagnetto, L. (1946). Ann. Math. Statist. 17, 53.
Feller, W. (1943). Ann. Math. Statist. 14, 389.
Finney, D. J. (1941). J. R. Statist. Soc. Suppl. 7, 155.
Fisher, R. A. (1931). British Association Mathematical Tables, 1, xxvi.
Fisher, R. A. (1941). Ann. Eugen., Lond., 11, 182.
Fisher, R. A., Corbet, A. S. & Williams, C. B. (1943). J. Anim. Ecol. 12, 42.
Gaddum, J. H. (1945). Nature, Lond., 156, 463.
Greenwood, M. & Yule, G. U. (1920). J. R. Statist. Soc. 83, 255.
Haldane, J. B. S. (1941). Ann. Eugen., Lond., 11, 179.
Haldane, J. B. S. (1945). Biometrika, 33, 222.
Haldane, J. B. S. (1949). J. R. Statist. Soc. B, 11, 1.
Irwin, J. O. (1941). J. R. Statist. Soc. Suppl. 7, 101.
Jones, A. E. (1948). Biometrika, 35, 283.
Kendall, D. G. (1949). J. R. Statist. Soc. B, 11, 230.
Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Griffin.
Kendall, M. G. (1948). Biometrika, 35, 291.
Lüders, R. (1934). Biometrika, 26, 108.
McKendrick, A. G. (1914). Proc. Lond. Math. Soc. 13, 401.
Newbold, E. M. (1927). J. R. Statist. Soc. 90, 487.
Neyman, J. (1939). Ann. Math. Statist. 10, 35.
Pólya, G. (1930). Ann. Inst. Poincaré, 1, 117.
Preston, F. W. (1948). Ecology, 29, 254.
Quenouille, M. H. (1949). Biometrics, 5, 162.
Shenton, L. R. (1949). Biometrika, 36, 450.
Simpson, E. H. (1949). Nature, Lond., 163, 688.
Thomas, M. (1949). Biometrika, 36, 18.
Williams, C. B. (1937). Ann. Appl. Biol. 24, 404.
Williams, C. B. (1944). J. Ecol. 32, 1.
Wishart, J. (1947). J. Inst. Actu. Stud. Soc. 6, 140.
Yule, G. U. (1910). J. R. Statist. Soc. 73, 26.
Yule, G. U. (1944). The Statistical Study of Literary Vocabulary. Cambridge University Press.
















ON QUESTIONS RAISED BY THE COMBINATION OF TESTS 
BASED ON DISCONTINUOUS DISTRIBUTIONS 


By E. S. PEARSON


1. INTRODUCTION 


In statistical practice it often happens that we wish to combine the results of a number of 
independent experiments which have all been planned to test a common hypothesis. Thus, 
for example, several experiments comparing two treatments may have been carried out, but 
owing to differences in error variance or to other changes in conditions between experiments, 
it is not possible to pool all the data together. The overall test calls therefore for the com- 
bination of a number of independent tests of significance. As was first pointed out by Fisher 
(1932, §21-1), when dealing with continuous variables an overall test may be obtained very 
simply by an application of the probability integral transformation, which may be defined 
in the following general terms. 

Let $p(x)$ be the probability density function of a continuous random variable $x$ in the
interval $a \le x \le b$, where $p(x) = 0$ for $x<a$ or $x>b$. Then if we write
$$y = \int_a^x p(x)\,dx, \tag{1}$$
$y$ is uniformly distributed in the interval (0, 1) and $z = -2\log_e y$ is distributed as $\chi^2$ with
$\nu = 2$ degrees of freedom, that is to say,
$$p(z) = \tfrac12e^{-\frac12z}. \tag{2}$$
Similarly
$$1-y = \int_x^b p(x)\,dx \tag{3}$$
will have the same distribution. Further, if $M$ is the median value of $x$, such that
$$\int_a^M p(x)\,dx = \int_M^b p(x)\,dx = 0.5$$
and $y'$ is defined as follows:
$$y' = 2\int_a^x p(x)\,dx \ \text{ if } x<M, \qquad y' = 2\int_x^b p(x)\,dx \ \text{ if } x>M, \tag{4}$$
then $y'$ is also uniformly distributed in (0, 1) and $-2\log_e y'$ will have the $\chi^2$ distribution with
$\nu = 2$. If now $x_i$ ($i = 1, 2, \ldots, k$) are $k$ independent random variables with probability density
functions $p_i(x)$, $a_i \le x_i \le b_i$, and if $y_i$ and $y_i'$ are defined as in (1) and (4) above, it follows that:
$$\text{(i) } Q_1 = -2\sum_{i=1}^{k}\log_e y_i, \qquad \text{(ii) } Q_2 = -2\sum_{i=1}^{k}\log_e(1-y_i), \qquad \text{(iii) } Q_3 = -2\sum_{i=1}^{k}\log_e y_i', \tag{5}$$
are each distributed as $\chi^2$ with $\nu = 2k$ degrees of freedom.
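The combination rule can be sketched in a few lines. For an even number of degrees of freedom $2k$ the upper tail area of $\chi^2$ has the closed form $e^{-q/2}\sum_{j<k}(q/2)^j/j!$, so no statistical library is needed; the probability-integral values used below are hypothetical.

```python
import math

def chi2_sf_even(q, dof):
    """Upper tail area of chi-squared for an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = q / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j                     # (q/2)**j / j!
        total += term
    return math.exp(-half) * total

def combine(ys):
    """Q = -2 * sum(ln y_i), referred to chi-squared with 2k degrees of freedom."""
    q = -2.0 * sum(math.log(y) for y in ys)
    return q, chi2_sf_even(q, 2 * len(ys))

q1, p1 = combine([0.05])                     # a single test returns its own level
q3, p3 = combine([0.10, 0.20, 0.30])         # hypothetical values of y_i
print(round(q1, 3), round(p1, 4))
print(round(q3, 3), round(p3, 4))
```

The same function serves for $Q_2$ and $Q_3$ on substituting $1-y_i$ or $y_i'$ for $y_i$.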

The application of these results to the combination of independent tests of significance is
straightforward. $x_i$ will be one of $k$ independent test statistics, $p(x_i \mid H_0)$ will be its probability
density distribution if the hypothesis tested is true and

(1) $y_i$ or $1-y_i$, (2) $y_i'$

will be used according to whether it is appropriate to take the component tests in (1)
asymmetrical (single-tailed) or (2) symmetrical (double-tailed) form.

This comprehensive test is not, however, immediately applicable if the distribution of $x$ is
discontinuous. The matter was recently discussed by Lancaster (1949), who pointed out the
importance of having such a test available in a situation often met where the relevant
probability distributions are either binomial, Poisson or (for the case of 2 × 2 tables)
hypergeometric and where the number of observations in any single group may be very small.
Lancaster proposed for the discontinuous case two modifications of the standard test,
leading to statistics whose sampling distributions were approximately those of $\chi^2$. The effect
of discontinuity has also been considered in greater detail by David & Johnson (1950). The
work of these writers was aimed at providing more accurate approximations, in terms of the
probability integral of a continuous variable, of what is still a discontinuous distribution.
A much more radical method of attack, however, follows from a simple extension of the
procedure discussed by Eudey (1949), Stevens (1950) and Tocher (1950) in recent papers.
This consists in the simple if unconventional device of adding by a separate ‘random
experiment’ a continuous variable (say $u$) to the original discrete variable ($X$) and so obtaining
a continuous variable ($x$) to which the methods for the continuous variables outlined above
may be accurately applied.

The possibility of this conversion has been recognized by statisticians for a number of 
years; it was recently raised and discussed in a paper read by Anscombe to the Royal 
Statistical Society (1948). But Eudey, Stevens and Tocher, in the papers referred to, have 
investigated from different angles, and rather more fully than before, the consequences of 
applying such a procedure. Its adoption would be accompanied by very great advantages, 
particularly in the case of interval estimation and in the problem of combining independent 
tests of significance. There are, however, a number of objections to its use, which many 
statisticians would regard as decisive. The object of the present paper is, first, to illustrate 
the method and its connexion with Lancaster’s proposal and, afterwards, to discuss the 
objections and their relation to theories of probability. 


2. OUTLINE OF THE CONVERSION DEVICE 


Let $X$ be a discontinuous random variable which can assume values $0, 1, \ldots, X, \ldots$ with
probabilities $f(0), f(1), \ldots, f(X), \ldots$. Write
$$F(X) = \sum_{t=0}^{X}f(t). \tag{6}$$
Let $u$ be a continuous random variable, independent of $X$, and uniformly distributed in
(0, 1). Write
$$x = X+u. \tag{7}$$
Then, using $x_0$, $X_0$ and $u_0$ to denote particular observed values of the variables,
$$\Pr\{x \le x_0\} = \Pr\{X < X_0\}+u_0f(X_0) = F(X_0-1)+u_0f(X_0).$$
If we now take
$$y(X,u) = F(X-1)+uf(X), \tag{8}$$
it will follow that $y$ is a continuous random variable uniformly distributed in (0, 1). In
practice, $X$ is the random variable whose value is determined by the experiment proper,
while $u$ is determined by an ‘auxiliary experiment’ which may be most easily performed by
selecting a number from a table of random digits.




The following is an illustration of the procedure. The probabilities associated with the
possible partitions within the 2 × 2 table

        X        6 − X   |   6
        5 − X    X − 1   |   4
        -------------------------
        5        5       |  10

are given by the hypergeometric function
$$f(X) = \frac{5!\,5!\,6!\,4!}{10!\,X!\,(6-X)!\,(5-X)!\,(X-1)!}. \tag{9}$$
Thus we have the following table:

        X       f(X)                F(X)
        1       1/42 = 0.0238       0.0238
        2      10/42 = 0.2381       0.2619
        3      20/42 = 0.4762       0.7381
        4      10/42 = 0.2381       0.9762
        5       1/42 = 0.0238       1.0000

















In Fig. 1 the five discrete ordinates ending in small circles represent the distribution of
$X$, while that of $x = X+u$ is represented by the probability histogram in which the discrete
probability value at $X$ is spread out uniformly over the interval $(X, X+1)$. Suppose the
observed value of $X$ is 2 and that the random selection of a two-figure number is 82; then
$x = 2.82$. In the type of application discussed by Stevens (1950), where fiducial or confidence
limits for an unknown parameter are required, the value of $x$ will be what is directly needed.


























Fig. 1 (horizontal axis: scale of X (or x))


When dealing with statistical tests we need, however, the probability integral of $x$, or the
$y(X,u)$ of equation (8). This is most easily obtained in the following manner. For the
example given above, with $X = 2$, it is seen that
$$F(X-1) = F(1) = 0.0238, \qquad F(X) = F(2) = 0.2619.$$
In a series of four-digit random numbers, any number between 0238 and 2618 is equally
likely to occur. If then we run down a column of these numbers and select the first one
occurring in the range 0238-2618, this, when divided by 10,000, may be taken as $y(2, u)$.


For example, if the random number is 2186, then $y(2, u) = 0.2186$ and is proportional to the
shaded area shown on the left of the histogram in Fig. 1. $u$, if needed, could be found from
$$u = \frac{0.2186-0.0238}{0.2381} = 0.818.$$
It will be noted that if the double act of sampling is repeated, i.e. of selecting $X$ in accordance
with the law (9) and then determining $u$ by the random number process, there will be an
equal probability of $y$ assuming any one of the 10,000 numbers 0000-9999.
It follows that if the hypothesis specifying the probability law $f(X)$ is correct,
$$z = -2\log_e y(X, u)$$
will be distributed as $\chi^2$ with 2 degrees of freedom, and, further, that a series of independent
tests may be combined as indicated in §1 above.
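The conversion device for this example can be sketched as follows. The hypergeometric probabilities are written in the equivalent binomial-coefficient form $\binom{6}{X}\binom{4}{5-X}\big/\binom{10}{5}$ for $X = 1, \ldots, 5$, which reproduces the tabulated values; the function names are our own, and in practice $u$ would be drawn by a random process rather than fixed.

```python
import math

def f(X):
    """Hypergeometric probability of the 2 x 2 table with entry X (valid for X = 1..5)."""
    return math.comb(6, X) * math.comb(4, 5 - X) / math.comb(10, 5)

def F(X):
    """Cumulative probability; F(0) = 0 since X = 1 is the smallest possible value."""
    return sum(f(t) for t in range(1, X + 1))

def y(X, u):
    """Probability integral of x = X + u."""
    return F(X - 1) + u * f(X)

def z(X, u):
    """Chi-squared equivalent on 2 degrees of freedom."""
    return -2.0 * math.log(y(X, u))

for X in range(1, 6):
    print(X, round(f(X), 4), round(F(X), 4))

u = 0.818                                    # value recovered in the worked example
print(round(y(2, u), 4), round(z(2, u), 3))
```

Replacing the fixed `u` by `random.random()` gives one realization of the auxiliary experiment.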





3. COMPARISON WITH LANCASTER’S TESTS 


For a given result of the experiment, i.e. for fixed $X$, the distribution of $z = -2\log_e y(X, u)$
for variation in the random element $u$ in (0, 1) will be that of a truncated section of an
exponential curve. Thus the minimum and maximum values for $z$, attained when $u = 1$ and
0, respectively, are given by
$$\text{Min. } z = -2\log_eF(X), \qquad \text{Max. } z = -2\log_eF(X-1). \tag{10}$$
Lancaster (1949), although not considering the introduction of $u$, suggested the use of two
statistics which are, in fact, (a) the expectation or mean value, and (b) the median value
of $z$. Since
$$y(X,u) = F(X-1)+uf(X) = e^{-\frac12z}, \tag{11}$$
we obtain the following results:
(a) Mean $z$ or Lancaster’s $\chi_m^2$:
$$\bar z = \int_0^1 z\,du = 2-2\{F(X)\log_eF(X)-F(X-1)\log_eF(X-1)\}/f(X). \tag{12}$$
(b) Median $z$ or Lancaster’s $\chi'^2$:
$$\text{Median } z = -2\log_e\tfrac12\{F(X)+F(X-1)\}, \quad\text{if } F(X-1)\ne0, \tag{13}$$
$$\text{Median } z = 2-2\log_eF(X), \quad\text{if } F(X-1)=0. \tag{13a}$$
The definition (13a) in the case $F(X-1) = 0$ was introduced by Lancaster so that
$\chi'^2 = \chi_m^2$ when $X$ is the extreme observation.
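For the hypergeometric example of §2 the two statistics, together with the range of $z$, can be tabulated directly. The sketch takes the mean as $2-2\{F(X)\ln F(X)-F(X-1)\ln F(X-1)\}/f(X)$, which is what integration of $-2\ln\{F(X-1)+uf(X)\}$ over $u$ gives, and the median as in (13) and (13a); the function names are illustrative.

```python
import math

def f(X):
    """Hypergeometric probabilities of the Section 2 example (X = 1..5)."""
    return math.comb(6, X) * math.comb(4, 5 - X) / math.comb(10, 5)

def F(X):
    return sum(f(t) for t in range(1, X + 1))

def mean_z(X):
    """Mean of z over u for fixed X."""
    a, b = F(X - 1), F(X)
    a_term = a * math.log(a) if a > 0 else 0.0   # a ln a -> 0 as a -> 0
    return 2.0 - 2.0 * (b * math.log(b) - a_term) / f(X)

def median_z(X):
    """Median of z over u for fixed X, with the special rule for the extreme observation."""
    a, b = F(X - 1), F(X)
    if a == 0:
        return 2.0 - 2.0 * math.log(b)
    return -2.0 * math.log(0.5 * (a + b))

for X in range(1, 6):
    z_min = -2.0 * math.log(F(X))
    z_max = -2.0 * math.log(F(X - 1)) if F(X - 1) > 0 else math.inf
    print(X, round(z_min, 3), round(mean_z(X), 3), round(median_z(X), 3), z_max)

expectation_of_mean = sum(f(X) * mean_z(X) for X in range(1, 6))
print(expectation_of_mean)
```

The expectation of the mean statistic over the distribution of $X$ is 2, the sum telescoping, while for $X > 1$ the median statistic falls below the mean.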

The total variance of $z$, which equals 4, is of course made up of the variance of $\bar z = \chi_m^2$ and
the variance of $z$ about $\bar z$ for fixed $X$. It follows that while the expectation of $\bar z$ is 2, the
variance is bound to be somewhat less than 4. The expectation of median $z = \chi'^2$ is a little
less than 2, and its variance, again, is less than 4. Lancaster gave some numerical examples,
and concluded that unless the number of permissible, discrete values for $X$ was very small
indeed, both $\chi_m^2$ and $\chi'^2$ might be regarded as distributed (under the null hypothesis) as
a $\chi^2$ with 2 degrees of freedom. Independent tests, each yielding a value of $\chi_m^2$ (or $\chi'^2$), could
therefore, he suggested, be combined by summing these values as outlined in §1 above. The
use of $\chi'^2$, rather than $\chi_m^2$, would, he thought, be preferred in practice because it could be
more easily calculated. It should be noted that $\chi'^2$ corresponds to $-2\log_e w$ in David
& Johnson’s notation (1950, p. 42).

The examples given in §4 below will illustrate, in particular cases, the relation between
these different quantities. A more critical discussion of their use and interpretation will be
reserved for §5.







4. ILLUSTRATIVE EXAMPLES


The methods of analysis which have been suggested will be compared on three examples. 
In the present section, only the calculations will be presented, the interpretation and 
critical comment on the results being left to §5. 

Example I. The data were obtained in course of an investigation in progress at the Safety
in Mines Research and Testing Branch of the Ministry of Fuel and Power, concerning the
relative safety of mining explosives. Table 1 shows the result of six separate experiments
comparing two explosives, A and B, under a number of different conditions. The cartridges
of explosive are fixed in a gas chamber and a record is made of whether each shot results
in an ignition (E) or non-ignition (not-E). The weight of cartridge and the conditions of
the explosion were varied in the different comparisons, so that it is not possible to pool the
results. The problem is to consider whether, taken as a whole, there is evidence that
explosive A is safer than B under the conditions of the experiments.*


Table 1. Safety in Mines data

                    Explosive A                Explosive B
Comparison     Ignitions   Total shots    Ignitions   Total shots
     1             8           10             5            5
     2             2           10            13           15
     3             4            5             5            5
     4             1            5             1           10
     5             1            5             0           10
     6             0           10             4            6


The result of each comparative trial may be represented in a 2 x 2 table; for example, for
comparison 1, we have

              E     Not-E    Total
   A          8       2       10
   B          5       0        5
   Total     13       2       15


On the hypothesis of no difference, the conditional distribution of X (no. of ignitions for 
explosive A) for fixed margins will be of hypergeometric form, and the F(X) of equation (6) 
will be the lower tail sum of this series. 
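The tail sums required here can be obtained by direct summation of hypergeometric terms. The sketch below is an illustration only: the function `hypergeom_tail_sums` and its argument names are hypothetical, and it assumes that X is the count in the cell (A, E) with all margins held fixed.

```python
from math import comb

def hypergeom_tail_sums(x, row_a, row_b, col_e):
    """Lower tail sums F(X-1) and F(X) for a 2 x 2 table with fixed margins.

    x     : observed count in cell (A, E)
    row_a : total for row A;  row_b : total for row B
    col_e : total for column E
    """
    n = row_a + row_b
    lowest = max(0, col_e - row_b)        # smallest X compatible with the margins
    denom = comb(n, col_e)

    def prob(k):
        # hypergeometric probability of k in cell (A, E)
        return comb(row_a, k) * comb(row_b, col_e - k) / denom

    f_lo = sum(prob(k) for k in range(lowest, x))    # F(X-1)
    f_hi = f_lo + prob(x)                            # F(X)
    return f_lo, f_hi
```

For comparison 1 the call would be `hypergeom_tail_sums(8, 10, 5, 13)`, the margins being those of the 2 x 2 table shown above.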

The problem is one which seems to require a comprehensive test of the null hypothesis, 
sensitive to detect a trend in the direction of A being safer than B. With these very small 
frequencies conventional methods will give rise to inaccuracies which are difficult to assess,
and it is here that the introduction of the random u-element provides a solution which, if it 
were otherwise acceptable, is mathematically correct. 

The results of the analysis are summarized in Table 2. Cols. (2) and (3) give the tail sums, 
calculated from the appropriate hypergeometric series; col. (4) shows the random number, 


* The rather fuller data actually collected at the station made it possible to subdivide the results 
into three series of comparisons within each of which only the weight of charge varied. It was then 
possible to apply probit analysis technique. 



lying between F(X-1) and F(X), determined as described on p. 385 above.* Col. (5) gives
the transformed variable z, which on the null hypothesis will be distributed exactly as
$\chi^2$ with 2 degrees of freedom. Cols. (6) and (7) give Lancaster's ‘mean’ and ‘median’ value
$\chi^2$ determined from equations (12) and (13). Cols. (8) and (9) show the extreme values that
z would have assumed had the random number drawn given u as 1 or 0.


Table 2. Analysis of data in Example I

                                                                  z̄ =       Med. z
Comparison    F(X-1)     F(X)      y(X,u)    z = -2 log_e y      χ_m²      = χ_m'²    Min. z    Max. z
   (1)         (2)       (3)        (4)           (5)            (6)        (7)        (8)       (9)
    1        0         0.4286     0.0051        10.56            3.69       3.69       1.69       ∞
    2        0.00005   0.00149    0.00107       13.68           14.80      14.34      13.02     19.97
    3        0         0.5000     0.0907         4.80            3.39       3.39       1.39       ∞
    4        0.4286    0.9048     0.8804         0.25            0.86       0.81       0.20      1.69
    5        0.6667    1.0000     0.9846         0.03            0.38       0.36       0.00      0.81
    6        0         0.00820    0.00490       10.64           11.61      11.61       9.61       ∞
  Total                                         39.96           34.73      34.20      25.91       ∞
  P(χ²) (ν = 12)                               <0.001          <0.001     <0.001      0.011       0



































At the bottom of the table are given the sums of cols. (5) to (9), and the probability, $P(\chi^2)$,
of obtaining a value of $\chi^2$ as large or larger than this sum value, for ν = 12 degrees of freedom.
Points which may be noted are:

(1) In three comparisons the observed X falls at the lowest term of the appropriate
hypergeometric series; hence F(X-1) = 0. The luck of the draw might therefore make
y zero or, at any rate, exceedingly small. The upper limit for z has therefore been denoted as
∞ for these cases.

(2) Median z is slightly less than $\bar{z}$, except in the cases where F(X-1) = 0, when they
are equal.

(3) The comprehensive test, whether applied exactly to the sum of the six values of
$z = -2\log_e y(X, u)$ or approximately to the sum of Lancaster's $\chi^2$'s, establishes significance
at the 1 in 1000 level, and even supposing the exceedingly unlikely result that each z had its
minimum possible value, the total $\chi^2$ would have been 25.91, and so at the 1% level.

(4) The largest contributions to the total $\chi^2$ come from the comparisons 2 and 6.
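The combination step used at the foot of the tables may be sketched as follows. Because each comparison contributes 2 degrees of freedom, the total always has an even number, and for even ν the upper tail of $\chi^2$ has a finite closed form. The function names below are hypothetical, and the closed form is a standard result brought in for the illustration; it is not a formula quoted from the paper.

```python
import math

def chi2_upper_tail_even(total, dof):
    """P(chi-square on `dof` degrees of freedom >= total), for even dof.

    Uses exp(-t/2) * sum_{j=0}^{k-1} (t/2)^j / j!  with k = dof / 2.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = total / 2.0
    term, acc = 1.0, 0.0
    for j in range(dof // 2):
        acc += term                # add (t/2)^j / j!
        term *= half / (j + 1)     # advance to the next term of the series
    return math.exp(-half) * acc

def combine(z_values):
    """Sum the z's of k independent comparisons and refer the total to
    chi-square on 2k degrees of freedom."""
    total = sum(z_values)
    return total, chi2_upper_tail_even(total, 2 * len(z_values))
```

Applied to a column of z values, `combine` returns the total and the probability entered in the last two rows of the table.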


Example II. The data are the result of an investigation, described by Rothschild (1949), 
into the effect of small electric currents on the fertilizing capacity of bull semen, used for 
artificial insemination of heifers. After collection, each sample of semen was divided into two 
equal portions; one portion had an electric current passed through it, following a method 
suggested for the measurement of sperm activity; the other portion was not so treated. The 
treated and untreated portions were then used to inseminate heifers; effectiveness was 
measured by the number of heifers which did not become pregnant, as judged by their being 
returned by the owners for re-insemination. In Table 3, this is the meaning of the column 

* The seven numbers were determined by opening Kendall & Babington Smith's (1939) Table of
Random Sampling Numbers at random and running down a column to find the first number, treated
as a decimal fraction, lying between F(X-1) and F(X).







headed ‘Returns’; a significantly greater proportion of returns among heifers for which the 
treated semen had been used would imply that the electrical test procedure was harmful. 
As the samples, leading to the results in Table 3, might be heterogeneous it was considered 
that the figures should not be pooled by merely adding the columns. In the paper quoted, 
Rothschild used a method of analysis suggested by R. A. Fisher involving the angular 
transformation for binomial variables. There is, of course, some approximation involved in 
this, and it is of interest to apply the same methods as have been used in Example I above. 


Table 3. Rothschild data

              Inseminations with        Inseminations with
               untreated semen            treated semen
Sample        Returns      Total        Returns      Total
   1             5           11            1            9
   2             3            9            4            9
   3             3           12            3            5
   4             6           11            3            9
   5             4           10            0            5
   6             3           11            1            5
   7             1            4            3            6

Table 4. First analysis of data in Example II

                                                              z̄ =       Med. z
Sample      F(X-1)     F(X)      y(X,u)    z = -2 log_e y    χ_m²      = χ_m'²    Min. z    Max. z
  (1)        (2)       (3)        (4)           (5)          (6)        (7)        (8)       (9)
   1        0.8808    0.9881     0.9867         0.03         0.14       0.14       0.02      0.25
   2        0.1674    0.5000     0.4099         1.78         2.28       2.20       1.39      3.57
   3        0.0276    0.2054     0.1031         4.54         4.54       4.30       3.17      7.18
   4        0.6890    0.9201     0.7802         0.50         0.46       0.44       0.17      0.74
   5        0.8462    1.0000     0.9906         0.02         0.16       0.16       0.00      0.33
   6        0.3654    0.8187     0.4871         1.44         1.10       1.05       0.40      2.01
   7        0.0714    0.4524     0.2511         2.76         2.89       2.68       1.59      5.28
 Total                                         11.07        11.57      10.97       6.74     19.36
 P(χ²) (ν = 14)                                 0.68         0.64       0.69       0.94      0.15



































With the figures as arranged, X will denote the number of returns in the second column of 
the table and we want a comprehensive test of the null hypothesis, sensitive to detect a trend 
in the direction of the treatment causing an increased proportion of returns. Table 4 shows 
a similar analysis to Table 2, F(X) being the lower tail-sum of the appropriate hyper- 
geometric series. We note that: 

(1) In this case the luck of the draw has actually placed $\Sigma(z)$ between $\Sigma(\bar{z})$ and
$\Sigma(\text{median } z)$.

(2) The comprehensive test gives no suggestion of significance, whichever criterion is used.
This would have been the conclusion even in the unlikely event of $\Sigma(z)$ attaining its maximum
value.













(3) The test used by Rothschild leads to a standardized normal deviate ξ, large positive
values of which would have indicated that the electrical treatment was harmful. He found
that ξ = -1.22,* so that the P(ξ), comparable to the $P(\chi^2)$ of the present tests, equalled 0.89.

It is clear, therefore, that there is no evidence, when the results are considered as a whole, 
that the treatment reduces the fertilizing capacity of the semen. A casual inspection of the 
figures in Table 3 does, however, suggest that there might be some heterogeneity present in 
the sense that in some cases it is the treated and in others the untreated material that appears 
to give considerably the better result. This point may be investigated by using the double- 
tailed or symmetrical test based on the y' defined in equation (4) above. Thus
$$ y'(X, u) = \begin{cases} 2y(X, u) & \text{if } y(X, u) < 0.5, \\ 2\{1 - y(X, u)\} & \text{if } y(X, u) > 0.5, \end{cases} \qquad (14) $$
and $z = -2\log_e y'(X, u)$.

Similarly, for Lancaster's median z, we should take
$$ \chi_m'^2 = \begin{cases} -2\log_e\{F(X) + F(X-1)\} & \text{if } \tfrac{1}{2}\{F(X) + F(X-1)\} < 0.5, \\ -2\log_e\{2 - F(X) - F(X-1)\} & \text{if } \tfrac{1}{2}\{F(X) + F(X-1)\} > 0.5. \end{cases} \qquad (15) $$

The result of applying this analysis is shown in Table 5 for z and median z. For the former,
the same random values of u have been taken as in Table 4. In this case, owing to the chance
association of rather large values of u with the large values of X in samples 1 and 5, the total
$\chi^2$ for z is considerably larger than for median z. Significance could not be claimed, even with
the $\chi^2$ of 21.83, but the example brings out the point which must be faced in the discussion
that the ‘luck of the draw’ for the u-values is bound sometimes to put $\Sigma(z)$ beyond a
significance level not exceeded by either of Lancaster's statistics, and vice versa.
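The double-tailed transformation may be sketched in a few lines. The function names are hypothetical; the median form assumes, as the tabulated values of Example II indicate, that the dividing point is reached when the mean of F(X) and F(X-1) equals 0.5.

```python
import math

def two_tailed_z(y):
    """Fold y about 0.5 as in equation (14), then return (y', -2 log_e y')."""
    y_prime = 2.0 * y if y < 0.5 else 2.0 * (1.0 - y)
    return y_prime, -2.0 * math.log(y_prime)

def two_tailed_median(f_lo, f_hi):
    """Double-tailed median statistic of equation (15).

    f_lo = F(X-1), f_hi = F(X); the fold occurs where their mean is 0.5,
    i.e. where their sum is 1.
    """
    s = f_hi + f_lo
    return -2.0 * math.log(s if s < 1.0 else 2.0 - s)
```

Both functions take quantities already computed for the single-tailed analysis, so that no fresh random numbers are needed.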


Table 5. Second analysis of data in Example II

Sample             y'(X, u)     z = -2 log_e y'     Med. z = χ_m'²
   1                0.0266           7.25                4.06
   2                0.8198           0.40                0.81
   3                0.2062           3.16                2.91
   4                0.4396           1.64                1.88
   5                0.0188           7.95                3.74
   6                0.9742           0.05                0.41
   7                0.5022           1.38                1.29
 Total                              21.83               15.10
 P(χ²) (ν = 14)                      0.083               0.37














Example III. In this illustration the distribution of X on the null hypothesis follows the
binomial and not the hypergeometric series. The problem is of a type which may often arise
in comparing frequencies of rare events. The data have been taken for purposes of illustration
from those collected in a much fuller investigation into traffic accidents undertaken by the
Road Research Laboratory (described by Manning, 1949). In Table 6, cols. (3) and (4) show,
respectively, the number of cyclists ($a_1$) and motor-cyclists ($a_2$) involved in accidents leading
to personal injury on sections* of five main roads near London during the period June


* The figure is given in his paper as +1.22, but the sign appears to be in error.







1946 to May 1947. Since the vehicle mileage of these two types of road-user will have been
different, the figures are not comparable as they stand. Cols. (6) and (7) give rough estimates,
$M_1$ and $M_2$, in millions of miles, of the appropriate total vehicle mileage during the year on
the sections of the roads considered. If vehicles are involved in accidents in a random
manner, it might be expected that $a_1$ and $a_2$ would be Poisson variables with expectations
$m_1$ and $m_2$. If that is the case we might ask the following question: Allowing for the difference
in vehicle mileage (but not for differences in speed governing the vehicle-hours on the road),
is there evidence from these data that the risk of accident is less for a pedal-cyclist than for
a motor-cyclist?

Table 6. Road Research data

                        Vehicles involved in accidents        Estimated vehicle mileage
                                                                     in millions
Comparison    Road     Cycles    Motor-cycles   r = a_1 + a_2    Cycles    Motor-cycles      λ
                       (a_1)        (a_2)                        (M_1)        (M_2)
   (1)        (2)       (3)          (4)             (5)          (6)          (7)          (8)
    1         R_1        5            4               9           1.69         0.80         0.68
    2         R_2        2            2               4           1.80         0.50         0.78
    3         R_3        3            1               4           1.82         0.33         0.85
    4         R_4        4            3               7           1.79         0.57         0.76
    5         R_5        3            5               8           1.56         0.54         0.74























Formally, the problem is that of testing the hypothesis that $m_1/m_2 = M_1/M_2$, making the
test sensitive to alternatives $m_1/m_2 < M_1/M_2$. It is known† that if $a_1$ and $a_2$ are independent
Poisson variables, then on the null hypothesis and within the conditional set of samples for
which $a_1 + a_2 = r$ is constant, $a_1$ follows a binomial $\{(1-\lambda) + \lambda\}^r$, where $\lambda = M_1/(M_1 + M_2)$. In
our general theory we may put $a_1 = X$, so that f(X) is a binomial term. By adding the
random element u we shall have the y(X, u) of equation (8) uniformly distributed in (0, 1),
that is to say, it has a distribution freed from dependence on the restriction $a_1 + a_2 = r$ = constant.
The comprehensive test is obtained as before by summing $z = -2\log_e y(X, u)$ for each
comparison.
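For the binomial case the tail sums follow from the same kind of direct summation. The sketch below uses comparison 2 of Table 6 ($a_1 = 2$, r = 4, λ = 0.78); the function name and the value of u are assumptions made for illustration, the latter not being the random number actually drawn in the paper.

```python
from math import comb, log

def binomial_tail_sums(a1, r, lam):
    """F(X-1) and F(X) for X = a1 in the binomial {(1 - lam) + lam}^r."""
    def prob(k):
        # binomial probability of k successes out of r
        return comb(r, k) * lam**k * (1.0 - lam)**(r - k)
    f_lo = sum(prob(k) for k in range(a1))
    return f_lo, f_lo + prob(a1)

# Comparison 2 of Table 6: a_1 = 2, a_2 = 2, so r = 4, with lambda = 0.78
f_lo, f_hi = binomial_tail_sums(2, 4, 0.78)
u = 0.5                                   # illustrative value only
z = -2.0 * log(f_lo + u * (f_hi - f_lo))  # contribution of this comparison
```

With u = 0.5 the value of z coincides with Lancaster's median statistic for the comparison, since y is then the mean of F(X-1) and F(X).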

The results of the analysis are shown in Table 7. F(X) and F(X-1) are the lower tail-sums
of the binomials $\{(1-\lambda) + \lambda\}^r$ with $X = a_1$ and using the values of λ given in col. (8) of Table 6.
It will be seen that there is relatively little difference between $\Sigma(z)$, $\Sigma(\bar{z})$ and $\Sigma(\text{Med. } z)$ and
that, taken as a whole, we should conclude that the risk of accident per vehicle mile was less
for a cyclist than for a motor-cyclist. This result is amply confirmed by the fuller data. It
will be noticed, however, that the result would have been inconclusive had we confined our
attention to the first four sections of road; the relevant figures are shown at the bottom of
the table.

It should be emphasized that the application of this method to these particular data is
perhaps not altogether justifiable, partly because the estimates of the vehicle mileage, $M_1$








* Between 6 and 15 miles long.
† See, for example, Pearson (1948) for a discussion of the test and references. Also, for another
solution, see Haldane (1948).










and $M_2$, were very rough, but also because the frequencies $a_1$ and $a_2$ may not have been
independent, if more than one vehicle was involved in the same accident. However, data of
similar type where $a_1$ and $a_2$ are independent and small, and where $\lambda = m_1/(m_1 + m_2)$ can be
fairly closely estimated, are of quite common occurrence. If the λ's are not the same for all
comparisons, then the overall test proposed, if acceptable, avoids all complications arising
from the discrete character of the original Poisson distributions.


Table 7. Analysis of data in Example III

                                                                 z̄ =       Med. z
Comparison    F(X-1)     F(X)      y(X,u)    z = -2 log_e y     χ_m²      = χ_m'²    Min. z    Max. z
   (1)         (2)       (3)        (4)           (5)           (6)        (7)        (8)       (9)
    1        0.1252    0.3173     0.2413          2.84          3.09       3.02       2.30      4.16
    2        0.0356    0.2122     0.1003          4.60          4.38       4.18       3.10      6.67
    3        0.1095    0.4780     0.2983          2.42          2.61       2.45       1.48      4.42
    4        0.0617    0.2231     0.1447          3.87          4.02       3.90       3.00      5.57
    5        0.00172   0.01224    0.00397        11.06         10.17       9.91       8.81     12.73
 Total, 1-5                                      24.79         24.27      23.46      18.69     33.55
 P(χ²) (ν = 10)                                   0.006         0.007      0.009      0.045    <0.001
 Total, 1-4                                      13.73         14.10      13.55       9.88     20.82
 P(χ²) (ν = 8)                                    0.090         0.080      0.095      0.27      0.008



































5. DISCUSSION


(5.1) The case for and against
Before examining obvious objections to the adoption of statistical tests involving the 
introduction of this additional random element, it may be well to summarize what is gained 
by their use. 

(a) In the first place we have secured for discontinuous distributions one objective which 
is commonly regarded as desirable in a test of significance. If we follow the rule of procedure 
laid down, we know precisely the chance of rejecting the null hypothesis when it is true; in 
other words, we have an exact significance level and do not have to be content with using an 
upper bound to this risk. 

(b) This result may hardly be considered necessary in the case of a single comparison,
where a knowledge of the numerical values of F(X-1) and F(X) should provide all that the
statistician needs on purely probability grounds to make up his mind on whether to reject 
the null hypothesis or not. But as soon as we wish to draw a broad conclusion from a number 
of independent tests, the advantage of an overall test criterion having a simply determined, 
continuous probability integral becomes apparent. 

(c) While it is true that the use of Lancaster’s statistics or of angular or other similar 
transformations provides variables whose distributions can be fairly closely represented by 
known, continuous distributions, there must always be some doubt of the ‘closeness’ when 
dealing with small frequencies. To illustrate this I have calculated for the combination of 
samples 5, 6 and 7 of Example II, the possible values that $\bar{z}_5$, $\bar{z}_6$ and $\bar{z}_7$ (i.e. the $\chi_m^2$ values)







might assume, keeping the marginal totals of the corresponding 2x 2 tables fixed at the 
observed values as follows: 











                      Sample 5                      Sample 6                      Sample 7
              Returns   Others   Total      Returns   Others   Total      Returns   Others   Total
Untreated        .         .       10          .         .       11          .         .        4
Treated          .         .        5          .         .        5          .         .        6
Total            4        11       15          4        12       16          4         6       10


Assigning to each value of X its appropriate hypergeometric probability, it was possible
to calculate the chance that
$$ S = \bar{z}_5 + \bar{z}_6 + \bar{z}_7 $$
exceeds (a) the 5% and (b) the 1% level of significance for a $\chi^2$ having 6 degrees of freedom.
The result was as follows:

                                              5%         1%
χ² significance level                       12.592     16.812
No. of values of S exceeding this level       59         34
Chance of S exceeding this level            0.0429     0.0089


The differences between 0.05 and 0.0429, 0.01 and 0.0089 are in this case small, but as long
as $\chi_m^2$, or any other criterion used, can assume only discrete values, the element of uncertainty
will be present.

(d) Since the distribution of $\Sigma(z) = \Sigma\{-2\log_e y(X, u)\}$ is the same within every conditional
set, e.g. whatever be the marginal values of the 2 x 2 tables, differences in point of view as to
the reference set to which the test should be related, which have led to some controversy,
become irrelevant.

(e) In the case of a single comparison, Tocher (1950) has shown that if we require a test
region of fixed ‘size’ then, in spite of the introduction of the random element, the test is more
powerful in the sense of Neyman and Pearson than any other test. The extent to which
the comprehensive test also possesses optimum properties of this kind has not so far been
explored.

(f) It is sometimes necessary to compare the efficiency of two statistical tests, one or both
of which depend on a statistic having a discontinuous distribution. Difficulties then arise
in adjusting the critical regions to be of the same size, e.g. so that both correspond to a 5%
level of significance. The introduction of u would avoid this difficulty.

(g) If the values of F(X-1) and F(X) have to be calculated, and this may be necessary
in any analysis of small frequencies, y(X, u) is found immediately if random number tables
are available. The overall test is therefore as quick to carry out as that based on Lancaster's
$\chi_m'^2$ and is a good deal quicker than that based on $\chi_m^2$.
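The enumeration described under (c) above can be checked by machine. The sketch below is an illustration, not the author's own computation: the function names are hypothetical, the margins are those of samples 5, 6 and 7 in Table 3 (with X taken as the number of returns for untreated semen), and the significance levels 12.592 and 16.812 are those quoted in the text.

```python
from itertools import product
from math import comb, log

def mean_z_distribution(row1, row2, col1):
    """All (probability, mean z) pairs for a 2 x 2 table with fixed margins.

    row1, row2 : row totals;  col1 : total of the first column.
    X is the cell in the first row and first column; mean z is equation (12).
    """
    n = row1 + row2
    lowest, highest = max(0, col1 - row2), min(col1, row1)
    denom = comb(n, col1)
    pairs, f_lo = [], 0.0
    for x in range(lowest, highest + 1):
        p = comb(row1, x) * comb(row2, col1 - x) / denom
        f_hi = f_lo + p
        a = f_hi * log(f_hi)
        b = f_lo * log(f_lo) if f_lo > 0 else 0.0   # limit 0 when F(X-1) = 0
        pairs.append((p, 2.0 - 2.0 * (a - b) / p))
        f_lo = f_hi
    return pairs

def exceedance(tables, level):
    """Number and total probability of combinations with S above `level`."""
    count, chance = 0, 0.0
    for combo in product(*tables):
        s = sum(zbar for _, zbar in combo)
        if s > level:
            count += 1
            prob = 1.0
            for p, _ in combo:
                prob *= p
            chance += prob
    return count, chance

# (untreated total, treated total, total returns) for samples 5, 6 and 7
tables = [mean_z_distribution(10, 5, 4),
          mean_z_distribution(11, 5, 4),
          mean_z_distribution(4, 6, 4)]
count_5pc, chance_5pc = exceedance(tables, 12.592)
count_1pc, chance_1pc = exceedance(tables, 16.812)
```

Each sample admits five values of X, so that 125 combinations are examined in all.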

However, even when these advantages have been considered, it is impossible to deny that 
the introduction of the random element strikes a discordant note in statistical practice, not 
only because it is unfamiliar but because it appears to offend some common-sense principle. 
Although in the examples given above it has turned out that the conclusions based on z are 
the same as those based on $\chi_m^2$ or $\chi_m'^2$, it is clear that this will sometimes not be the case;
expressed crudely, the idea that a decision should depend on what could be described as 
a toss-up made after the true experimental results are available, is certainly difficult to 
accept. Yet it is doubtful whether we can dismiss straight away this type of procedure as 










impossible, for the implications run deeper than at first sight appears. There is a conflict of 
attitudes of mind which is not easily solved. As M. G. Kendall wrote in a recent article on 
‘Reconciliation of theories of probability’ (1949, p. 103): ‘We are concerned not only with 
the relationship of theory with the external world but also with the relationship between 
our calculus and the way we think.’ 

The two main attitudes held to-day towards the theory of probability both result from an 
attempt to define the probability number scale so that it may readily be put in gear with 
common processes of rational thought. For one school, the degree of confidence in a pro- 
position, a quantity varying with the nature and extent of the evidence, provides the basic 
notion to which the numerical scale should be adjusted. The other school notes how in 
ordinary life a knowledge of the relative frequency of occurrence of a particular class of
events in a series of repetitions has again and again an influence on conduct; it therefore 
suggests that it is through its link with relative frequency that a numerical probability 
measure has the most direct meaning for the human mind. 

An intriguing characteristic of the problem posed by the introduction of this ‘last- 
minute’ random element is that it shows that most of us, whether we regard ourselves as 
‘frequentists’ or not, to use Kendall’s term, are in fact influenced by the primitive idea on 
which Jeffreys’s theory is based. If we are at first repelled by the suggestion of using this 
device it is, I think, because we feel instinctively that having completed the experiment proper,
the relevant information on which to reach a rational conclusion must be available without an 
appeal to any list of random numbers. But the recognition of this fact does not, of course, 
necessarily imply acceptance of the view that degree of belief can be represented on a 
numerical scale. 

Elsewhere in this issue of the journal, Barnard (1950, p. 207) has suggested that there 
should be a difference between the theory to be used in planning an experiment in advance 
and that required in drawing inferences after the results are known. The first is the theory 
of probability, related closely to relative frequency, the second is the theory of likelihood. 
But it seems difficult to accept this solution; for if the planning is based on the consequences 
that will result from following a rule of statistical procedure, e.g. is based on a study of the 
power function of a test and then, having obtained our results, we do not follow the first rule 
but another, based on likelihoods, what is the meaning of the planning? 


(5.2) Introduction of the random element into the problem of interval estimation


It may be helpful at this point to consider the proposal made by Eudey (1949) and Stevens 
(1950), in its special application to the binomial. If X is the number of individuals bearing a 
character A in a sample of size n drawn randomly from a much larger population in which a 
proportion p possess A and q = 1 - p do not, then f(X) will be a term in the expansion of the
binomial $(q + p)^n$. The problem of calculating from X a confidence or fiducial interval for p was
first attacked by Clopper & Pearson (1934), who provided some charts from which the interval
could be roughly calculated. Later Stevens (Fisher & Yates, 1942, Table VIII1) provided
an alternative solution with tables. In both cases it was only possible to associate with the 
intervals the lower bound of the probability that the interval would include the unknown 
value of p, i.e. the upper bound of the risk of non-inclusion or error. Thus, as Stevens points 
out, the implication remained that the limits provided were unnecessarily wide and might 
in some way be narrowed until the stipulated risk was reached. This result can be achieved 
by using the continuous variable x = X + u in place of X and then, on replacing the upper
bound of the risk by an exact risk of error, the interval is invariably narrowed. In Fig. 2
are plotted the boundaries of the confidence belt for n = 10 and risk of error 0.20 (or
confidence coefficient 0.80) tabled by Stevens (1950, p. 126). For a given p the two limiting values
for x = X + u are those obtained by regarding the binomial as a probability histogram as in
Fig. 1, and then cutting off the 0.10 tail areas, by dividing the blocks as required in
proportional parts. Using the current method based on X only, we make use of the discrete
points indicated by small circles.*

[Fig. 2. Confidence belt; horizontal axis: scale of x or X, from 0 to 11; vertical axis from 0.0 to 1.0.]





As an example, suppose three individuals in a sample of ten possess a character A. By
the current method the confidence limits $p_1$ and $p_2$ are given by the circled points on the
ordinate at X = 3. Thus we can make the statement
$$ 0.116 < p < 0.552 $$
with a probability of at least 0.80 of being correct. Using Stevens's procedure we select
a random number in (0, 1); suppose it is 0.74, then x = 3.74 and we find from the bounding
curves that $p_1 = 0.161$, $p_2 = 0.534$. Hence the statement becomes
$$ 0.161 < p < 0.534, $$
and the associated confidence coefficient is exactly 0.80. The interval is reduced from 0.436
to 0.373, and both limits have been pulled in.
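The limits in this example can be approximated numerically instead of being read from the chart. The sketch below rests on an assumption about the construction: that the limits are the values of p at which $y = F(X-1; p) + u f(X; p)$ equals 0.90 and 0.10 respectively, the discrete limits corresponding to u = 0 for the lower and u = 1 for the upper. The function names are hypothetical and bisection is merely one convenient way of solving for p.

```python
from math import comb

def y_value(n, x, u, p):
    """y = F(X-1; p) + u f(X; p) for the binomial (q + p)^n."""
    def prob(k):
        return comb(n, k) * p**k * (1.0 - p)**(n - k)
    return sum(prob(k) for k in range(x)) + u * prob(x)

def solve_p(n, x, u, target, tol=1e-10):
    """Bisection for the p at which y_value equals `target`.

    y_value decreases as p increases, so the root lies between 0 and 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if y_value(n, x, u, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def limits(n, x, risk, u=None):
    """Return (p1, p2). With u=None the discrete limits are given;
    otherwise the randomized limits for the continuous variable x + u."""
    u_lower, u_upper = (0.0, 1.0) if u is None else (u, u)
    p1 = solve_p(n, x, u_lower, 1.0 - risk / 2.0)
    p2 = solve_p(n, x, u_upper, risk / 2.0)
    return p1, p2
```

Calling `limits(10, 3, 0.20)` and `limits(10, 3, 0.20, u=0.74)` should give limits close to the two pairs quoted above, small differences being expected since the published values were read from curves.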

It seems true to say that those who have accepted and understood the principle of con- 
fidence interval estimation have regarded the method as a useful technique whose adoption 
means no more and no less than this: that in long-run use of the method in statistical practice 


* For convenience, in Clopper & Pearson’s charts (1934), these points were joined by continuous 
curves, whose ordinates could, however, only be used at integral values of the abscissa. 

It should be noted that Eudey (1949), who derives a shortest unbiased confidence interval in 
Neyman’s sense, gives a confidence belt which is rather differently placed with regard to the Clopper- 
Pearson points (see his Fig. 13). 













(whether in binomial or other problems) the interval will include the unknown parameter
in about the expected number of cases.* If it was previously accepted as a technique with 
this property, its essential character is not altered by the addition of Stevens’s random 
element, and since, as a result, the interval is narrower and the risk of error made more 
precise, it would appear unreasonable to refuse the improvement if it were made available 
in suitable tables. On the confidence interval approach (though not on the fiducial) 
alternative estimators giving different intervals for the same data may be used. Thus it is 
accepted that statistician A, using range or mean range for rapidity in calculation to estimate
σ, will get a different and, on the average, rather wider confidence interval for an unknown
mean than statistician B, using standard deviation. It would therefore not appear anomalous
that in the binomial problem using different random numbers u, A and B will find different
intervals for p. 

It appears, indeed, to be the case that in problems of estimation, where generally all that 
we ask for is a broad measure of reliability and where any decision, if it is to follow, is not 
closely linked with the result of a test, the method proposed will not meet with strong 
objection. It is where the choice of a random number has to be followed at once in the 
statistical procedure by a verdict on significance that the conflict between the frequentist 
and non-frequentist outlook becomes apparent. 


(5.3) Other tests involving last-moment randomization


Before a final summing up, it is of interest to note the existence of certain other statistical 
procedures where the introduction of randomization, not essential to the conduct of the 
experiment, has been suggested. 

Geary (1935), in examining possible tests for normality of a univariate frequency distribution,
suggested (p. 316) a statistic which may be written as follows:
$$ w_{n-1} = \sum_{i=1}^{n-1} |y_i| \Big/ \sqrt{(n-1)\sum_{i=1}^{n-1} y_i^2}, $$
where (on the hypothesis tested) $y_i$ (i = 1, 2, ..., n-1) are n-1 independent normally
distributed linear functions of the n observations $x_i$. He also provided some tables of the
probability integral of $w_{n-1}$ for small values of n. The $y_i$ were not symmetrical functions of the
$x_i$, for example,
$$ y_1 = (x_1 - x_2)/\sqrt{2}, \qquad y_2 = (x_1 + x_2 - 2x_3)/\sqrt{6}, \quad \text{etc.}, $$
so that the x's had to be ordered in a random manner before the y's could be calculated. Thus
$w_{n-1}$ would assume different values for the n! possible permutations of the $x_i$. If the
observations were collected in some natural random order, this could be used in calculating $y_i$.
Otherwise, a process of randomization was required.
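The dependence of such a statistic on the order of the observations is easily exhibited. The sketch below assumes that the linear functions continue in the pattern of the two quoted, namely $y_i = (x_1 + \dots + x_i - i x_{i+1})/\sqrt{i(i+1)}$; the function names and the data are invented for illustration.

```python
import math
import random

def helmert_contrasts(x):
    """y_1, ..., y_{n-1} with y_i = (x_1 + ... + x_i - i x_{i+1}) / sqrt(i (i + 1))."""
    ys, running = [], 0.0
    for i in range(1, len(x)):
        running += x[i - 1]               # x_1 + ... + x_i
        ys.append((running - i * x[i]) / math.sqrt(i * (i + 1)))
    return ys

def geary_w(x):
    """w_{n-1} = sum |y_i| / sqrt((n - 1) sum y_i^2)."""
    ys = helmert_contrasts(x)
    return sum(abs(y) for y in ys) / math.sqrt(len(ys) * sum(y * y for y in ys))

data = [2.1, 3.4, 1.9, 5.0, 4.2, 2.8]     # made-up observations
rng = random.Random(1)                    # fixed seed so the sketch is repeatable
shuffled = data[:]
rng.shuffle(shuffled)                     # the randomization of order
w_original, w_shuffled = geary_w(data), geary_w(shuffled)
```

The sum of squares of the y's is unaffected by the ordering, being the sum of squares about the mean, so that only the numerator of the statistic changes from one permutation to another.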

Scheffé (1943) has proposed a test of the hypothesis that the means are equal in two
normal populations from which two independent samples of size $n_1$ and $n_2 \ge n_1$, respectively,
have been drawn, where no assumption is made of equal population variances. This test
requires a random pairing of $n_1$ of the observations from the second sample with the $n_1$ of
the first sample. Again, if the observations are not recorded in a random order, randomization
is needed before the pairing is carried out, and the numerical value of the test statistic will
depend on the outcome of this process.


* Or, for the discontinuous case in rather more than the nominal expected number of cases. 




The range or difference between extreme observations in a sample has been used as a means
of estimating σ. For a given total number of observations, N, the accuracy of estimation may
often be improved considerably if the observations are broken up into a number, m, of equal
groups each containing n (say 5 to 10) observations and a mean range is calculated. Recently,
Lord (1947, 1950) has suggested a form of modified t-test, in which the root mean-square
estimate of σ is replaced by a range estimate. If the number of observations exceeds, say,
N = 10, the efficiency of the test, i.e. its power to detect differences in mean values, may be
considerably increased by breaking the observations from which σ is to be estimated into
groups and calculating a mean range. But this subdivision must be a random one. If, from
the nature of the problem, we are confident that the observations come to hand in a random
order, then we can find the range of the 1st n, the 2nd n and so on, take an average and use
Lord's test. But if we are not sure that the variations among the observations are independent
of order, we must introduce some random procedure on purpose to group them.
Thus the estimate of σ and hence the value of the test statistic will depend on this final
randomization.

Here then we are faced with a dilemma. Is it justifiable to use Lord’s test when there is an 
obvious grouping of the observations inherent in the form of the data, but not so when 
artificial randomization is necessary? Or, if the test is legitimate in both cases, what is the 
essential difference between using randomization here and in the tests suggested earlier in 
this paper? Alternatively, are all tests based on mean range to be condemned? One may note 
that in the range test some little effort is involved in randomly grouping the observations 
and finding the mean range, so that we do not readily have before us a number of alternative 
values of the test criterion. In the case of the binomial or hypergeometric we can get as many 
values of u, and therefore of z = X + u, as we please by merely looking further down the
column of random numbers. This suggests a levity in approaching a decision on significance
which offends our sense of scientific propriety! But the difference between a more concealed
and more blatant form of tossing up may vanish when viewed in perspective. 


(5.4) Conclusion


This paper has raised more questions than it has answered. ‘The way we think’ is so much 
a personal matter that it would be presumptuous to claim one right way of regarding the 
randomization procedure described in the earlier sections of the paper. Just as views have 
changed with time on the propriety of introducing a random element which would not 
otherwise be there, into the conduct of an experiment, so it is possible that with time the 
instinctive objection to using a random number after the experiment proper is completed may 
disappear. Cases will always arise from time to time where it is evident that the luck of the 
draw has made the u elements in the z’s pull against the X elements, so that a result becomes 
significant that we judge should not, or vice versa. But it is also true that a random draw 
made before the experiment is sometimes found to lead to an arrangement, e.g. of plots, which 
it is realized will be likely, if accepted, to render that particular experiment inconclusive or 
biased. No doubt, in such cases, practical common sense prevails and chance is invoked 
a second time. This problem is always present when artificial randomization is employed, and
it was with this difficulty in mind that W. S. Gosset in his last statistical paper (‘Student’, 
1937) advocated the greater use of balanced in place of randomized designs. 

The long-run verdict on this form of comprehensive test for discontinuous variables will 
depend on many factors, among which utility is likely to carry much weight. The test is often










required as a rough foot rule where more detailed consideration might involve weighting the 
different series. Thus, for a small number of series, Lancaster’s suggestion of adding the 


values of $\chi_m^2 = \text{Med. } \chi^2 = -2\log_e \tfrac12\{F(X) + F(X-1)\}$


may meet all needs. Once the F(X) have been found, this test is quick to apply and is 
conservative, in that it is rather less likely to claim significance than the nominal probability 
level suggests. On the other hand, if many series are to be combined, the risk of accumulation 
in the bias and the subnormal variation of $\chi_m^2$ may swing the choice to $\sum\{-2\log_e y(X, u)\}$.
The present paper has included a number of examples which make possible some study of
the behaviour of the test based on $y(X, u)$; but the reader may well prefer to suspend
judgement until he has had opportunity of making his own comparisons. 
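As an illustration only, the two criteria contrasted above may be compared for a single binomial series. The sketch rests on assumptions not stated in this section: the median term is taken as −2 log of the mean of F(X) and F(X − 1), and the randomized quantity is taken as y(X, u) = F(X − 1) + u·P(X), with u drawn uniformly from (0, 1]. All function names are hypothetical.

```python
import math


def binom_cdf(x, n, p):
    """F(x) = P(X <= x) for a binomial(n, p) variable; zero when x < 0."""
    if x < 0:
        return 0.0
    upper = min(x, n)
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(upper + 1))


def median_chi2(x, n, p):
    """Median-type term: -2 log of the mean of F(x) and F(x - 1)."""
    return -2.0 * math.log(0.5 * (binom_cdf(x, n, p) + binom_cdf(x - 1, n, p)))


def randomized_chi2(x, n, p, u):
    """Randomized term: -2 log y, with y = F(x - 1) + u * P(X = x).

    The form of y is an assumption made for this sketch; u must lie in (0, 1].
    """
    lower = binom_cdf(x - 1, n, p)
    y = lower + u * (binom_cdf(x, n, p) - lower)
    return -2.0 * math.log(y)


if __name__ == "__main__":
    # One series: X = 3 successes in n = 10 trials with p = 0.5.
    print("median term:", median_chi2(3, 10, 0.5))
    for u in (0.1, 0.5, 0.9):
        print("randomized term, u =", u, ":", randomized_chi2(3, 10, 0.5, u))
```

Under these assumptions the randomized term coincides with the median term when u = 1/2 and moves to either side of it as u varies, so the sum over several series depends on the random numbers drawn.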

Finally, one point seems clear; to condemn the procedure out of hand may involve, for 
consistency, ruling out of court a number of other techniques which have been or are being 
accepted into current statistical practice. 


The author is indebted to Mr J. W. Gibson of the Safety in Mines Research and Testing 
Branch of the Ministry of Fuel and Power and to Dr F. Garwood of the Road Research 
Laboratory for supplying him with the original data used in Tables 1 and 6 respectively. 


REFERENCES 


Anscombe, F. J. (1948). J. R. Statist. Soc. A, 109, 181.

Barnard, G. A. (1950). Biometrika, 37, 203.

Clopper, C. J. & Pearson, E. S. (1934). Biometrika, 26, 404.

David, F. N. & Johnson, N. L. (1950). Biometrika, 37, 42.

Eudey, M. W. (1949). Technical Report No. 13, Statistical Laboratory, University of California.

Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. London and Edinburgh: Oliver
and Boyd.

Fisher, R. A. & Yates, F. (1942). Statistical Tables for Biological, Agricultural and Medical Research,
2nd ed. London and Edinburgh: Oliver and Boyd.

Geary, R. C. (1935). Biometrika, 27, 310.

Haldane, J. B. S. (1948). Biometrika, 35, 297.

Kendall, M. G. (1949). Biometrika, 36, 101.

Kendall, M. G. & Babington Smith, B. (1939). Tracts for Computers, no. 24. Cambridge University
Press.

Lancaster, H. O. (1949). Biometrika, 36, 370.

Lord, E. (1947). Biometrika, 34, 41.

Lord, E. (1950). Biometrika, 37, 64.

Manning, J. R. (1949). Unpublished Report.

Pearson, E. S. (1948). Biometrika, 35, 301.

Rothschild, Lord (1949). J. Agric. Sci. 39, 294.

Scheffé, H. (1943). Ann. Math. Statist. 14, 1.

Stevens, W. L. (1950). Biometrika, 37, 117.

‘Student’ (W. S. Gosset) (1937). Biometrika, 29, 363.

Tocher, K. D. (1950). Biometrika, 37, 130.







SIGNIFICANCE OF DIFFERENCE BETWEEN THE MEANS 
OF TWO NON-NORMAL SAMPLES 


By A. K. GAYEN, St Catharine’s College, University of Cambridge 


1. INTRODUCTION 


If $\bar{x}'$ and $\bar{x}''$ are the means of two samples of $n_1$ and $n_2$ members drawn from the same
population (or from two different populations having equal variance), and if $S_2'$ and $S_2''$ are the sample
sums of squares about the mean, then the test function

$$u = \sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{n_1 + n_2}}\;\frac{\bar{x}' - \bar{x}''}{\sqrt{(S_2' + S_2'')}}$$

has the ‘Student’ t-distribution (Fisher, 1925) with $(n_1 + n_2 - 2)$ degrees of freedom provided
the variables are normally distributed. But the situation will be different for non-normal
variation in the parent distributions. On the basis of some experimental results, E. S.
Pearson (1931) concluded that ‘in dealing with very small samples (say, if $n_1 + n_2 < 20$), the use
of the t-probability (normal-theory t) scale rather than the normal probability* scale is justified,
even if the population varies very considerably from the normal’. Examining the possible
form of the joint characteristic function of $\sum x' - \sum x''$ and $\sum x'^2 + \sum x''^2$ for the population
expressed by the first three terms of the Gram-Charlier series, M. S. Bartlett (1935)
observed that in the particular case where $n_1 = n_2$, the effect of parental skewness on the
normal-theory distribution of u is very small. R. C. Geary has recently (1947) deduced the
first four cumulants of u, on the assumption that the sampled populations have zero mean but
different values of cumulants $\lambda_r$ for $r > 2$. Utilizing these cumulants in the first few terms of a
derived expression (his formula (2.24)) for the frequency density of ‘Student’s’ t in the case of
a unique non-normal sample, he has shown by a few illustrative examples (in which, however,
only the negative tails of the distributions have been taken into consideration) that ‘the
actual probability could be considerably at variance with that shown in the standard table
for small samples’.

The purpose of the present investigation is to obtain the mathematical form of the frequency
function of u for any size of moderately non-normal samples. The parent populations
characterized by a priori values of the cumulants $\lambda_r$ have been considered to be represented
by Edgeworth’s generalized law of error where all terms containing population cumulants up
to the fourth order are included. If the higher cumulants of the populations other than those
considered are negligibly small, then the derived law is sufficiently accurate for any size of
sample. The same law, on the other hand, holds asymptotically in samples from any population
with finite cumulants. Thus it is not unlikely that the distribution obtained has quite
an extended range of applicability for moderate size of samples. The expressions for the
corrective tail areas are derived. The magnitudes of the effective corrections depend on the
population λ’s, but even where exact values of them are not available, we may sometimes
safeguard against error by considering the corrections for their plausible values.


* Referring to some ‘normal curves’ which Pearson examined for approximating the actual frequency 
of u (see E. S. Pearson, 1931, p. 123). 













2. JOINT DISTRIBUTION OF THE DIFFERENCE BETWEEN TWO MEANS 
AND THE POOLED SUM OF SQUARES WITHIN SAMPLES 


We shall consider the variables to have zero mean and unit variance without assuming that
the values of $\lambda_3$ $(= \sqrt{\beta_1})$ and $\lambda_4$ $(= \beta_2 - 3)$ in the universes from which the samples have been
derived are necessarily equal. Accordingly, let $x_1', x_2', \ldots, x_{n_1}'$ and $x_1'', x_2'', \ldots, x_{n_2}''$ be two samples
drawn respectively from the two different populations, $(\lambda_3', \lambda_4')$ and $(\lambda_3'', \lambda_4'')$, expressed by the
first four terms (up to $\lambda_3^2$) of the Edgeworth form of Type A series, viz.

$$f(x) = \phi(x) - \frac{\lambda_3}{3!}\,\phi^{(3)}(x) + \frac{\lambda_4}{4!}\,\phi^{(4)}(x) + \frac{10\lambda_3^2}{6!}\,\phi^{(6)}(x), \qquad (2.1)$$

where $\phi(x)$ is the standardized normal function and $\phi^{(r)}(x)$ its rth derivative. The mathematical
model thus set up will enable us to consider the question whether the samples have
been conceivably drawn from populations alike in their mean (variances supposed the same),
although they may differ in the degree of non-normality. The appropriate formulae in the
case of sampling from the same population will follow as a particular case from the more
general results reached by such a specification.
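A short numerical sketch may help in visualizing the specification (2.1). It uses the standard identity $\phi^{(r)}(x) = (-1)^r He_r(x)\,\phi(x)$, where $He_r$ is the Hermite polynomial of degree r, so that the series becomes $\phi(x)\{1 + \tfrac16\lambda_3 He_3 + \tfrac1{24}\lambda_4 He_4 + \tfrac1{72}\lambda_3^2 He_6\}$. The function names and the crude trapezoidal integrator are assumptions made for illustration and are not part of the paper.

```python
import math


def normal_pdf(x):
    """Standardized normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def edgeworth_pdf(x, lam3, lam4):
    """Four-term Edgeworth density of (2.1), written with Hermite polynomials."""
    he3 = x ** 3 - 3 * x
    he4 = x ** 4 - 6 * x ** 2 + 3
    he6 = x ** 6 - 15 * x ** 4 + 45 * x ** 2 - 15
    return normal_pdf(x) * (1.0 + lam3 / 6.0 * he3 + lam4 / 24.0 * he4
                            + lam3 ** 2 / 72.0 * he6)


def integrate(f, a=-10.0, b=10.0, steps=20000):
    """Composite trapezoidal rule on [a, b]; adequate for these smooth tails."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


if __name__ == "__main__":
    lam3, lam4 = 0.5, 1.0
    for power in range(5):
        moment = integrate(lambda x: x ** power * edgeworth_pdf(x, lam3, lam4))
        print("moment of order", power, "=", round(moment, 6))
```

The printed moments should come out as 1, 0, 1, $\lambda_3$ and $3 + \lambda_4$, confirming that the series has zero mean, unit variance and the stated third and fourth cumulants. For large $|\lambda_3|$ or $|\lambda_4|$ the expression can become negative over part of its range, which is one reason for confining attention to moderately non-normal populations.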


If we write

$$S = \bar{x}' - \bar{x}'' = S_1'/n_1 - S_1''/n_2,$$

$$Q = \sum_{1}^{n_1} (x' - \bar{x}')^2 + \sum_{1}^{n_2} (x'' - \bar{x}'')^2 = S_2' + S_2'', \qquad (2.2)$$


then, following the method of M. S. Bartlett (1935), the characteristic function of the dis- 
tribution of S and Q may be obtained in the form 





7 : 
M(t,,t,) = Ef{exp (it, S + it,Q)} = anh) 1+ 5i(*(j es “ay sit) +cg(it)h 
_ 3d, | 


! ’ °, \2 
+ rT {ast + 6d,(it,)* + 3d, — a =a 


6 
(1 —2it Dit ay (a(t )* +ds3) + 


+75 aC) ®— Oat) + Mylit)— Oa4-+ Sars aul) — Boalt)? + 804) 


9 6 
tare ~ Bi, yi (alta)? — 94) + G9; hi a0} | (2-3) 








where Roath os Ve = 1, +N. —2, (2-4) 
a= ajay (4—), mire ot (25) 

d, = 4%, y= MMM, dy = “84424 Atma tMAS (2°6) 

and fi = (3-3). 93 = (A3—A3)?- (7+ wa) 4 2(2 +1 )asas | 
d= (HM) (Oa Dag) gg = mast tmadst)—B0ayt4 gy +2(E +2), 


(2:7) 










Substituting (2.3) in

$$\frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\{-it_1 S - it_2 Q\}\, M(t_1, t_2)\, dt_1\, dt_2$$

and integrating term by term, the joint frequency distribution of S and Q will be found in the form


f(8,Q) = Wor)| 1+ silat (gis) 9 a i(3)* FT a (| 


+ arlon (Ga) + ose Hal) +8 0(5 (Ge) +4) +8 
salt Sahil) stl 2) -meeGia( 2) -otal$) 
+0(m,( ¢)- 204) oa + Vas. ire: r »\ | (28) 


2 #(¥2—2) p49 
where W(v.) = {exp(-5 re +) Sraes (277). (2-9) 


H,(S/./v,) are the Hermitian polynomials in (S/,/v,) of degree v, and the c-, d- and g-coefficients 
are given by (2-5)—(2-7). 








3. THE FREQUENCY DISTRIBUTION OF u 


Next introducing the new variable u by the substitution $S = \sqrt{\dfrac{(n_1 + n_2)\,Q}{n_1 n_2 \nu_2}}\;u$ in (2.8), and integrating
for Q between the limits 0 and ∞, we obtain the frequency function of u as

















flu) = 1 1 (M9) u* — 3(mg3) & 
W_ BUS, ¥%q) (1+ w/v, HAD * 375 {Br (vg + 2} (1 + w] 7g HOD 
+ (qo) w* — 6(m51) U* + 3(Mg9) (9) US — 15(m51) w4 + 45( M49) U? — 15(m35) (3-1) 
4! vf BS, 4(v_ + 4)) (1+ w?/y,)to+) 72v§ B(4, $(vg+4))(L+u2/r_)teet? — ’ 
where 
ng) = HAS +) TED (04 +2) + HAS— A) 
x {(%q—y)?/ 4/(M Ny) — 2 (My Mg) (2PQ + 1)/(V + 2)}, (3-2) 

(ma) = HAG + AG) TA (ny +2) + HAS AB) 


x {(Mq— 21)?! 4/(14 Mg) — 2 4/(my Ng) Vg/(¥_ + 2)}, 





(M049) = (AG +AG) {¥o(M2q— 01)?/(M, Ng) — 2V3/ (Vy + 3)} 
+3(AQ- we sou Ng —N) )?/(ny Mg) — 6V3/(V_+ +3)}, 


(Myx) = $(Ag + AG) {Yo(My— 03)?/ (my Mg) — 2V3/(V_ + 3)} 


+4$(Ag—A4) o- on {V_(M2—2,)?/(ny Mg) — 2g(¥_— 2)/(ve+3)}, f (33) 


aye, pian) 23 
(Maa) = H+ AO Tne (gD) matinee 
(n.—1;){(m_—M,)? v3 2v3(ve+4) 
+ HAG—Ad) (V, =H Mg (Ve+2)  (¥2+2)(¥2+3)’ J 


(Ng) = (AL +25)? yes 3)? (Ve— 4). 12p, 











Ny Np {V,+ 3) 
19 yo (Ma— My) [(M_—4)*(Vg—4)  4(v3— 40, + 6) 
+4058— 239 | ee 
+ yma { (Mg — 4)" ((Ve— 4) (Vg + 2)? (Vo + 3) — 120 NgQVQ(V_ + 1)) 
+A; -23)4{ — 2 2 + 2)8(ve43) 2° 2\"2 





4 2232 +1) (vg + 2)? + 2m, no(2v3 + 3V_+ “| 
(Ve + 1) (¥g + 2)? (vo +3) F 


(M51) = #(A3+A3)? eee 36v, 











5n,N» 5(v_+ 3) 
. ng, (Ne — 4) {(%_—,)? (5¥,—8) 4(3v2—4r,.+4+ 18) 
+ HAs? 25") ) Bi ’ \ Biv, + 3) 
+ yo {(%_— 1)? ((5¥_ — 8) (Vo + 3) (Vg + 2)? — 4, MQ(V_ + 1) (11¥, + 12)) 
+405-25)| Socal, 2 a (7,4 2)® (2 +3) 





4y,(9(V_ + 1) (Vg + 2)? + 2n, Ng V9(2V2— - 
= B(V> + 1) (¥» +2)" (Ye+3) , 











+. yarg { (Me — 24)? Vo(5, + 4) 12v3 
(na) = 20+ 15) Tt aes 
, (%2— 1) {(Mg— Ny)? VQ(5VQ+4) . _4¥Q(v3+2) 
+ HAs? As") (vg+2) | 5nyN9(v2+ 2) Pos eet 
{ (%q— 2)? (Vq(5V2 + 4) (Ve + 2)? (Vg+ 3) — 424 NgVQ(¥2 + 1) (TV + 16)) 
+ H(A3—As) later in I es : (+ 2)" (7543) ’ : 





4 4v2(3(v2+ 1) (vp + 2)? — 2n, n9(2v2 + 9vn+ a 
5(V_ + 1) (ve + 2)8 (ve + 3) 
+. yarg | (Me — Ny)? v2(5y, -- 16 12y3 
("s3) = HAs + ag a " as 2) (Vp +3) = 
+ #(As2 — Az?) (%_— 7) ear v3(5y, + 16) 4v3(v5 + 8y, + 18) 
9 (Vg +2) | 5m Mg(V_+ 2) (Vg +4) " 5(¥.+ 2) (v2 +3) (v2 +4) 
+hAS— a)e{ “= 1)? (V3(Vq + 2)? (Vg + 3) (5vg + 16) — 120, n,3(V2 + 1)(¥2+4)) 
5N4 MN» (Vo + 2)8 (vo +3) (v2 + 4) 
4V3(3(V2 + 1) (Va + 2)? + 2m, Mo(Vq + 4) (22+ al 
5(V_+ 1) (vg + 2)3 (vg + 3) (Vo + 4) 














> 





(3-4) 


The first four raw moments of u, calculated directly from its frequency function (3.1)
(neglecting expressions which contain powers of $\nu_2^{-1}$ higher than the second) will be found to be


1 : ‘ = 1 3 
wi(w)> {405-2 (7-3) +405 +0) GCs SY), 


2m,N, \vg 43 


1 2 6 1 
myuy>{(14 245) n+ ay ERD + age ay (= 1) 4 


2 
2, Ne Ve 








- on) 


+(A3—A3)? — + (Aj? —A3?)° 


wilu)= (405-29) (+ 2 ) #405440 (G- 4). 


ry 


(nt D2 . 
Ny Ng 





~(A;+ +e}, 


| 





> (3-5a) 




pa(u) = sal (8+ tp) PF (As — AG) Aa) (ata (Ea _ 1a) _ Nia _ 96) , 











2n Ne MN, \Vo ve Ve v3 
> yey ((Mg— 1)? (i- 12? v2 15p? 
+420 2mm, \vg ve we ve vi ) 
+ smo 120, . 2v,(3r, +26)  (m.—n,)* 4 27 
a ERE cero hi eco 1 ke ‘ 
+(A3—As) (-+ + mind 48 i. lan, ) (3-55) 


P nay (NE — 3) ‘n}) (3 8ly,\  27(n.—,)? ( ae 
2 2 rk FE eee | oath tice 
+ ee NN Ve ve ) bl nin? 2V2 th ) 


6y} 27 (n2—n?)? 1 
4 2 SS SS See 
+ (Ag +A3) Ce i; ae al: 

The approximate expressions for the first four cumulants deduced from (3.5a, b) were found
to agree with the results obtained from R. C. Geary’s formula (2-28)* by the substitution of 
A, =A, = 1. 

In the particular case of samples of equal size, drawn from the same or different popula- 
tions, and those of unequal size from the same population, the distribution of u may be easily 
deduced from (3-1). The distribution in the case of equal-sized samples from two different 
populations is studied in some detail in the following section. 





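4. SPECIAL CASE: EQUAL-SIZED SAMPLES
In the important case where $n_1 = n_2 = n$, the frequency density of u will be given by

$$
\begin{aligned}
p(u) ={}& \frac{\Gamma\{\tfrac12(\nu+1)\}}{\sqrt{(\pi\nu)}\,\Gamma(\tfrac12\nu)}\,\frac{1}{(1+u^2/\nu)^{\frac12(\nu+1)}}
 + \tfrac12(\lambda_3'-\lambda_3'')\,\frac{1}{6\nu\sqrt{\{2\pi(\nu+2)\}}}\,\frac{3\nu u-(2\nu+1)u^3}{(1+u^2/\nu)^{\frac12(\nu+4)}} \\
& - \tfrac12(\lambda_4'+\lambda_4'')\,\frac{\Gamma\{\tfrac12(\nu+5)\}}{12(\nu+3)\sqrt{(\pi\nu)}\,\Gamma\{\tfrac12(\nu+4)\}}\,\frac{u^4-6u^2+3\nu/(\nu+2)}{(1+u^2/\nu)^{\frac12(\nu+5)}} \\
& + \tfrac14(\lambda_3'+\lambda_3'')^2\,\frac{\Gamma\{\tfrac12(\nu+3)\}}{12\nu\sqrt{(\pi\nu)}\,\Gamma\{\tfrac12(\nu+4)\}}\,\frac{u^6-9u^4+\dfrac{9\nu}{\nu+2}\,u^2+\dfrac{3\nu^2}{(\nu+2)(\nu+4)}}{(1+u^2/\nu)^{\frac12(\nu+7)}} \\
& + \tfrac14(\lambda_3'-\lambda_3'')^2\,\frac{\Gamma\{\tfrac12(\nu+1)\}}{144\nu\sqrt{(\pi\nu)}\,\Gamma\{\tfrac12(\nu+4)\}}\,\frac{(2\nu^2+9\nu+10)u^6-3(2\nu^2+17\nu+18)u^4-\dfrac{9(2\nu^2+3\nu+2)\nu}{\nu+2}\,u^2+\dfrac{3(2\nu^2+15\nu+10)\nu^2}{(\nu+2)(\nu+4)}}{(1+u^2/\nu)^{\frac12(\nu+7)}} \\
={}& p_0(u) + \tfrac12(\lambda_3'-\lambda_3'')\,p_3(u) - \tfrac12(\lambda_4'+\lambda_4'')\,p_4(u) + \tfrac14(\lambda_3'+\lambda_3'')^2\,p_5(u) + \tfrac14(\lambda_3'-\lambda_3'')^2\,p_6(u), \qquad (4.1)
\end{aligned}
$$

where $\nu = 2n - 2$. As $p_0(u)$ is the normal-theory frequency function of u, the remaining
expressions $p_r(u)$ in (4.1) will be called the corrective factors due to population values of the
λ-coefficients appearing with them. It is interesting to notice that, to the order of approximations
considered, there is no corrective term in the difference of the kurtosis of the
populations.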


* There appears to be a slip in the first term of the expression for A*Z, in Geary’s formula (2-28). It 
should have been (in his notation) 


Aa, As nage +n” =A (* *) (1 eee 
(3+ 142 en x instead of nite +2 anal) 










Let us consider the tail probabilities of the distribution of u. The integrals $\int_{-\infty}^{-u_0} p_3(u)\,du$
and $\int_{u_0}^{\infty} p_3(u)\,du$ are equal in magnitude but opposite in sign, the former being positive. For
all other corrective factors the values of such integrals are identical. Considering the negative
tail of the distribution, we write

$$
\begin{aligned}
P(u_0) = \int_{-\infty}^{-u_0} p(u)\,du ={}& P_0(u_0) + \tfrac12(\lambda_3'-\lambda_3'')\,P_3(u_0) - \tfrac12(\lambda_4'+\lambda_4'')\,P_4(u_0) \\
& + \tfrac14(\lambda_3'+\lambda_3'')^2\,P_5(u_0) + \tfrac14(\lambda_3'-\lambda_3'')^2\,P_6(u_0), \qquad (4.2)
\end{aligned}
$$

where

$$P_0(u_0) = \int_{-\infty}^{-u_0} p_0(u)\,du = \tfrac12 I_{x_0}(\tfrac12\nu, \tfrac12), \qquad (4.3)$$

$$P_3(u_0) = \int_{-\infty}^{-u_0} p_3(u)\,du = \frac{1 + (2\nu+1)u_0^2/\nu}{6\sqrt{\{2\pi(\nu+2)\}}\,(1+u_0^2/\nu)^{\frac12(\nu+2)}}, \qquad (4.4)$$

$$P_4(u_0) = \int_{-\infty}^{-u_0} p_4(u)\,du = \frac{(\nu+2)u_0^3 - 3\nu u_0}{12(\nu+2)^2\,\nu^{\frac12}\,B(\tfrac12(\nu+2), \tfrac12)\,(1+u_0^2/\nu)^{\frac12(\nu+3)}}, \qquad (4.5)$$

$$P_5(u_0) = \int_{-\infty}^{-u_0} p_5(u)\,du = \frac{(\nu+2)(\nu+4)u_0^5 - 4\nu(\nu+4)u_0^3 - 3\nu^2 u_0}{6(\nu+2)^2(\nu+4)\,\nu^{\frac32}\,B(\tfrac12(\nu+2), \tfrac12)\,(1+u_0^2/\nu)^{\frac12(\nu+5)}}, \qquad (4.6)$$

$$P_6(u_0) = \int_{-\infty}^{-u_0} p_6(u)\,du = \frac{(2\nu+5)(\nu+4)(\nu+2)^2 u_0^5 + 2(2\nu+1)(\nu+4)(\nu-2)\nu u_0^3 - 3(2\nu^2+15\nu+10)\nu^2 u_0}{36(\nu+2)^2(\nu+4)\,\nu^{\frac52}\,B(\tfrac12\nu, \tfrac12)\,(1+u_0^2/\nu)^{\frac12(\nu+5)}}. \qquad (4.7)$$

Each of these integrals can also be expressed as a linear combination of ratios $I_{x_0}(p, q)$
(forms (4.4 bis)-(4.7 bis)), where $I_{x_0}(p, q)$ is the incomplete beta-function ratio with
$x_0 = \nu/(\nu + u_0^2)$. The values of the above integrals at the lower 2½% points of the
normal-theory u, for ν = 2, 4, 6, 8, 12, 20, 30, 40, 60, 120 and ∞ are given in Table 1.
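The tail corrections lend themselves to direct computation. The following sketch evaluates closed forms for $P_3$ to $P_6$ as read from this copy of (4.4)-(4.7); the readings of several exponents and divisors are reconstructions, checked only by their agreement with the entries of Table 1, and the function names are hypothetical.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b), via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def tail_corrections(nu, u0):
    """Return (P3, P4, P5, P6) at u0 for equal-sized samples, nu = 2n - 2.

    The closed forms are reconstructions of (4.4)-(4.7), not a verified
    transcription of the printed formulae.
    """
    s = 1.0 + u0 * u0 / nu
    b_half_nu = beta_fn(0.5 * nu, 0.5)
    b_half_nu2 = beta_fn(0.5 * (nu + 2), 0.5)
    p3 = (1.0 + (2 * nu + 1) * u0 ** 2 / nu) / (
        6.0 * math.sqrt(2.0 * math.pi * (nu + 2)) * s ** (0.5 * (nu + 2)))
    p4 = ((nu + 2) * u0 ** 3 - 3 * nu * u0) / (
        12.0 * (nu + 2) ** 2 * math.sqrt(nu) * b_half_nu2 * s ** (0.5 * (nu + 3)))
    p5 = ((nu + 2) * (nu + 4) * u0 ** 5 - 4 * nu * (nu + 4) * u0 ** 3
          - 3 * nu ** 2 * u0) / (
        6.0 * (nu + 2) ** 2 * (nu + 4) * nu ** 1.5 * b_half_nu2
        * s ** (0.5 * (nu + 5)))
    p6 = ((2 * nu + 5) * (nu + 4) * (nu + 2) ** 2 * u0 ** 5
          + 2 * (2 * nu + 1) * (nu + 4) * (nu - 2) * nu * u0 ** 3
          - 3 * (2 * nu ** 2 + 15 * nu + 10) * nu ** 2 * u0) / (
        36.0 * (nu + 2) ** 2 * (nu + 4) * nu ** 2.5 * b_half_nu
        * s ** (0.5 * (nu + 5)))
    return p3, p4, p5, p6


if __name__ == "__main__":
    # Degrees of freedom and normal-theory 2.5% points taken from Table 1.
    for nu, u0 in ((2, 4.3027), (6, 2.4469), (12, 2.1788), (30, 2.0423)):
        print(nu, [round(p, 4) for p in tail_corrections(nu, u0)])
```

In this reading $P_5$, the coefficient of $\tfrac14(\lambda_3'+\lambda_3'')^2$, reproduces the last column of Table 1 and $P_6$, the coefficient of $\tfrac14(\lambda_3'-\lambda_3'')^2$, the last but one.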

Obviously, the magnitude of the corrections to be applied to the normal-theory probability
depends on the population λ’s. But even where they are not known a priori we may sometimes
safeguard against error by considering corrections for their plausible values. R. C. Geary
(1947) has suggested the use of the estimated values of the λ’s from the sample observations
by applying R. A. Fisher’s corresponding k statistics.


5. ASYMPTOTIC CHARACTER OF THE DERIVED DISTRIBUTION 
The frequency function (3-1) is sufficiently accurate for any size of sample provided the 
populations are moderately non-normal (in which case they are satisfactorily expressed by 
the terms of the Edgeworth series considered in (2-1)). R. C. Geary’s (1947) asymptotic 










formulae for the cumulants of u in samples from any populations (his formulae (2.28)) are
free of terms in the fifth and higher order λ’s of the population, which indicates that these
higher population cumulants can occur only in terms in higher negative powers of n, the
sample size. Accordingly, if the samples are large enough, the terms as far as those in the
fourth and the square of the third order λ’s of the series for p(u), (3.1), will afford a sufficiently
close approximation to the actual frequency curve of u for any parent population. So for
moderate sizes of samples the probability density p(u) may have quite an extended range
of applicability.


Table 1. Showing the probability corrections (for population skewness and kurtosis) of the test
function u at the lower 2½% points of its normal-theory values. The approximate true
probability is given by the formula (4.2)

                         Probability corrections
   ν      u_0       P_3(u_0)    P_4(u_0)    P_6(u_0)    P_5(u_0)
   2     4.3027     0.0149      0.0024      0.0094      0.0042
   4     2.7764     0.0199      0.0024      0.0113      0.0027
   6     2.4469     0.0206      0.0019      0.0103      0.0014
   8     2.3060     0.0202      0.0015      0.0090      0.0007
  12     2.1788     0.0188      0.0010      0.0071      0.0002
  20     2.0860     0.0161      0.0006      0.0049      0.0000
  24     2.0639     0.0151      0.0005      0.0042      0.0000
  30     2.0423     0.0139      0.0003      0.0035     −0.0000
  40     2.0211     0.0123      0.0003      0.0027     −0.0000
  60     2.0003     0.0104      0.0002      0.0019     −0.0000
 120     1.9799     0.0075      0.0001      0.0010      0.0000
   ∞     1.9600     0.0000      0.0000      0.0000      0.0000


























In cases of equal-sized samples, as we have seen, the frequency function of u undergoes
considerable simplification. By the time $\nu = n_1 + n_2 - 2 = 20$, the probability corrections,
excepting that for half the difference of third order λ’s, are negligibly small; those for higher
order cumulants must be still smaller, according to the asymptotic properties of p(u) noted
above. The use of normal-theory probability in such cases may not lead to appreciable
inaccuracy, provided always the populations are nearly symmetrical, or are approximately
alike in the degree of skewness. The situation appears to remain more or less the same if
the samples do not differ widely in size.

In the case of a one-way classification for analysis of variance the distribution of the
variance ratio (which, for our non-normal samples, is denoted by w) has been studied by the
author (Gayen, 1950) in a paper printed on pp. 236-55 above, assuming samples of unequal
size drawn from the same non-normal population. In the special case of two samples, where
the distribution of w is equivalent to that of u, the error introduced by the difference in
sample size is in general found to be not very serious.

For ν = 12 (i.e. for samples of n = 7) the approximate true probabilities at u = −2.1788
(which is the normal-theory 2½% point) have been tabulated (Tables 2-4) in a few cases of













Table 2. Showing the comparative values of the approximate true probabilities of u in equal-sized
samples of 7 from two different skew mesokurtic populations

$$P(u < u_0\,(= -2.1788)) = \int_{-\infty}^{u_0} p(u)\,du = P_0(u_0) + \tfrac12(\lambda_3'-\lambda_3'')\,P_3(u_0) + \tfrac14(\lambda_3'+\lambda_3'')^2\,P_5(u_0) + \tfrac14(\lambda_3'-\lambda_3'')^2\,P_6(u_0)$$

                                       λ_3'
  λ_3''     −1.5     −1.0     −0.5      0.0      0.5      1.0      1.5
  −1.5     0.0255   0.0305   0.0364   0.0432   0.0509   0.0595   0.0691
  −1.0     0.0202   0.0252   0.0303   0.0362   0.0431   0.0509   0.0595
  −0.5     0.0176   0.0209   0.0251   0.0302   0.0362   0.0431   0.0509
   0.0     0.0150   0.0174   0.0208   0.0250   0.0255   0.0362   0.0432
   0.5     0.0134   0.0149   0.0174   0.0208   0.0251   0.0303   0.0364
   1.0     0.0126   0.0133   0.0149   0.0174   0.0209   0.0252   0.0305
   1.5     0.0128   0.0126   0.0134   0.0150   0.0176   0.0211   0.0255
































Table 3. Showing the comparative values of the approximate true probabilities of u in samples
of 7 from two different symmetrical populations

$$P(u < u_0\,(= -2.1788)) = \int_{-\infty}^{u_0} p(u)\,du = P_0(u_0) - \tfrac12(\lambda_4'+\lambda_4'')\,P_4(u_0)$$

                         λ_4'
  λ_4''     −2       −1        0        1        2
   −2     0.0270   0.0265   0.0260   0.0255   0.0250
   −1     0.0265   0.0260   0.0255   0.0250   0.0245
    0     0.0260   0.0255   0.0250   0.0245   0.0240
    1     0.0255   0.0250   0.0245   0.0240   0.0235
    2     0.0250   0.0245   0.0240   0.0235   0.0230


























Table 4. Showing the comparative values of the approximate true probabilities of u when two
samples of 7 are drawn from the same universe

$$P(u < u_0\,(= -2.1788)) = \int_{-\infty}^{u_0} p(u)\,du = P_0(u_0) - \lambda_4\,P_4(u_0) + \lambda_3^2\,P_5(u_0)$$

                         λ_3²
  λ_4       0.0      0.5      1.0      1.5      2.0
  −2      0.0270     —        —        —        —
  −1      0.0260   0.0261   0.0262     —        —
   0      0.0250   0.0251   0.0252   0.0254   0.0255
   1      0.0240   0.0241   0.0243   0.0244   0.0245
   2      0.0230   0.0232   0.0233   0.0234   0.0235











































populations having different values of $\lambda_3$ and $\lambda_4$. The results show that the tail probabilities
will be seriously affected if the differences of the skewness $\lambda_3$ of the sampled populations are
not small. In cases where the samples are taken from two symmetrical populations or from
the same universe, the effects are not very important. It will be noticed, however, that for
a leptokurtic symmetrical population the approximate true probabilities are always slightly
less than the corresponding normal-theory values. The variance of u in such cases, as will be
seen from its expression in (3.5a, b), is also less than its normal value provided the samples
are not widely different in size.

The results of Table 4 (based on $n_1 + n_2 = 14$ observations) tend to confirm Pearson’s
conjecture (1931, p. 126) that ‘in dealing with very small samples (say if $n_1 + n_2 < 20$) the
use of the t-probability scale (normal-theory t) rather than the normal probability scale
(normal curve with correct $\sigma_u = \sqrt{\{\nu/(\nu-2)\}}$) is justified, even if the population varies very
considerably from the normal’.

Example (S. D. Wicksell, 1917, pp. 36-9). The frequency distribution of weights of a
new-born child has shown: for boys, $\lambda_3' = -0.1320$, $\lambda_4' = 0.2856$, and for girls, $\lambda_3'' = -0.2148$,
$\lambda_4'' = 0.6672$. The standard deviations of the two distributions, being respectively 1.6836
and 1.6701, may be regarded as very nearly the same. The populations have been considered
by S. D. Wicksell (1917) to be well expressed by the Edgeworth series (2.1). Given two
random samples of 5, namely,

Sample I (boys, weights in grammes): 3321, 1818, 2662, 2342, 3786,
Sample II (girls, weights in grammes): 3174, 3667, 3592, 3772, 4983,

would it be justifiable to assume that their populations have the same mean?

Here u = −2.2732 for ν = 10 − 2 = 8 degrees of freedom. The approximate true probability
at 2.3060 (the normal-theory 5% point in this case) may be calculated from formula
(4.2) by using the results of Table 1. Thus, for the upper tail,

P(u > 2.3060) = 0.0250 − (0.0412)(0.0202) − (0.4764)(0.0015)
                + (0.0301)(0.0007) + (0.0017)(0.0090)
              = 0.0235,

and for the lower tail, P(u < −2.3060) = 0.0252.
So that P(|u| > 2.3060) = 0.0487. The sample value of u, −2.2732, is of course numerically
less than −2.3060, but we may calculate the actual probability in this case. We find

P(u < −2.2732) = 0.0263 + (0.0412)(0.0209) − (0.4764)(0.0012)
                 + (0.0301)(0.0005) + (0.0017)(0.0070)
               = 0.0267,

and P(u > 2.2732) = 0.0249.
Accordingly P(|u| > 2.2732) = 0.0516.
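The arithmetic of the example can be retraced with a few lines of code. This sketch is illustrative only: the function names are hypothetical, the test function is taken to be the usual pooled two-sample statistic, and the corrections are copied from the ν = 8 row of Table 1, with 0.0007 attached to the squared sum and 0.0090 to the squared difference of the skewness coefficients, as in the calculation above.

```python
import math


def pooled_u(sample1, sample2):
    """Two-sample test function u (the normal-theory t) with pooled sum of squares."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    q = (sum((x - mean1) ** 2 for x in sample1)
         + sum((x - mean2) ** 2 for x in sample2))
    nu = n1 + n2 - 2
    return math.sqrt(n1 * n2 * nu / (n1 + n2)) * (mean1 - mean2) / math.sqrt(q)


def corrected_tail(p0, lam_a, lam_b, corrections, lower=True):
    """Approximate tail probability in the manner of formula (4.2).

    lam_a, lam_b : (lambda3, lambda4) of the first and second population.
    corrections  : (P3, P4, P5, P6) at the chosen point.
    For the upper tail only the sign of the P3 term changes.
    """
    p3, p4, p5, p6 = corrections
    sign = 1.0 if lower else -1.0
    return (p0
            + sign * 0.5 * (lam_a[0] - lam_b[0]) * p3
            - 0.5 * (lam_a[1] + lam_b[1]) * p4
            + 0.25 * (lam_a[0] + lam_b[0]) ** 2 * p5
            + 0.25 * (lam_a[0] - lam_b[0]) ** 2 * p6)


if __name__ == "__main__":
    boys = [3321, 1818, 2662, 2342, 3786]
    girls = [3174, 3667, 3592, 3772, 4983]
    print("u =", round(pooled_u(boys, girls), 4))

    lam_boys, lam_girls = (-0.1320, 0.2856), (-0.2148, 0.6672)
    table_row = (0.0202, 0.0015, 0.0007, 0.0090)  # P3, P4, P5, P6 at nu = 8
    lower = corrected_tail(0.0250, lam_boys, lam_girls, table_row, lower=True)
    upper = corrected_tail(0.0250, lam_boys, lam_girls, table_row, lower=False)
    print("lower tail:", round(lower, 4), " upper tail:", round(upper, 4))
```

The two tails differ only through the skewness-difference term, which is why the total of the two tails stays so close to the nominal 5% even though each tail separately is disturbed.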


SUMMARY 


The mathematical form of the frequency distribution of the normal-theory test function t
(denoted here by u) used for testing the significance of the difference between two means has
been derived for populations specified by Edgeworth’s form of Type A series. The distribution
containing terms as far as those in the square and product of the third order population
cumulants is: (i) sufficiently accurate for any size of moderately non-normal samples, and













(ii) valid asymptotically for any universe. Thus it is not unlikely that for moderate size of
samples the formula has quite an extended range of applicability. In cases where samples
of unequal size are drawn from two different populations, the deviation of the distribution
of u from its normal-theory law may be considerable. For leptokurtic symmetrical populations
$\sigma_u^2$ is less than its corresponding normal-theory value, provided always the differences in
the size of samples are not very large. For equal-sized samples from the same population
the frequency curve of u may be closely approximated by the ‘Student’ normal-theory
law, if the total sample size is not too small.

Probability corrections at the 2½% points of the normal-theory u have been tabulated
(Table 1) for some representative cases of equal numbers of samples. The results of this table
and the derived formula (4.2) enable us to examine the true probability of u for a priori
values of the population λ’s. Where these are not known we may sometimes safeguard against
error by considering the corrections for their plausible values. An illustration is given to
show the practical application of the derived results.


In conclusion, I wish to acknowledge my indebtedness to Dr H. E. Daniels for his kind 
advice and criticism in the course of my investigations. 


REFERENCES 


Bartlett, M. S. (1935). Proc. Camb. Phil. Soc. 31, 223.

Fisher, R. A. (1925). Metron, 5, 90.

Gayen, A. K. (1950). Biometrika, 37, 236.

Geary, R. C. (1947). Biometrika, 34, 209.

Pearson, E. S. (1931). Biometrika, 23, 114.

Wicksell, S. D. (1917). K. svenska VetenskAkad. Handl. 58, 3, 36.







TESTING FOR SERIAL CORRELATION IN LEAST 
SQUARES REGRESSION. I 


By J. DURBIN and G. S. WATSON
Department of Applied Economics, University of Cambridge 


A great deal of use has undoubtedly been made of least squares regression methods in 
circumstances in which they are known to be inapplicable. In particular, they have often 
been employed for the analysis of time series and similar data in which successive observa- 
tions are serially correlated. The resulting complications are well known and have recently 
been studied from the standpoint of the econometrician by Cochrane & Orcutt (1949). 
A basic assumption underlying the application of the least squares method is that the error 
terms in the regression model are independent. When this assumption—among others—is 
satisfied the procedure is valid whether or not the observations themselves are serially 
correlated. The problem of testing the errors for independence forms the subject of this 
paper and its successor. The present paper deals mainly with the theory on which the test 
is based, while the second paper describes the test procedures in detail and gives tables of 
bounds to the significance points of the test criterion adopted. We shall not be concerned in 
either paper with the question of what should be done if the test gives an unfavourable result.

Since the errors in any practical case will be unknown the test must be based on the
residuals from the calculated regression. Consequently the ordinary tests of independence 
cannot be used as they stand, since the residuals are necessarily correlated whether the 
errors are dependent or not. The mean and variance of an appropriate test statistic have been 
calculated by Moran (1950) for the case of regression on a single independent variable. The 
problem of constructing an exact test has been completely solved only in one special case. 
R. L. & T. W. Anderson (1950) have shown that for the case of regression on a short Fourier 
series the distribution of the circular serial correlation coefficient obtained by R. L. Anderson 
(1942) can be used to obtain exact significance points for the test criterion concerned. This 
is due to the coincidence of the regression vectors with the latent vectors of the circular serial 
covariance matrix. Perversely enough, this is the very case in which the test is least needed, 
since the least squares regression coefficients are best unbiased estimates even in the non-null 
case, and in addition estimates of their variance can be obtained which are at least 
asymptotically unbiased. 
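That the residuals are necessarily correlated, even when the errors are independent, can be seen in the simplest case of a straight line fitted with a constant term. With independent errors of variance σ², the residuals have covariance matrix σ²(I − H), where H is the projection (‘hat’) matrix of the regression; this is a standard result, not a formula taken from the present paper, and the function name below is hypothetical.

```python
def residual_covariance(x, sigma2=1.0):
    """Covariance matrix of least squares residuals for y = a + b*x + error.

    Assumes independent errors with common variance sigma2.  For a fitted
    mean and one regressor the hat matrix has elements
    h_ij = 1/n + (x_i - xbar)(x_j - xbar)/Sxx.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    cov = []
    for i in range(n):
        row = []
        for j in range(n):
            h_ij = 1.0 / n + (x[i] - xbar) * (x[j] - xbar) / sxx
            identity = 1.0 if i == j else 0.0
            row.append(sigma2 * (identity - h_ij))
        cov.append(row)
    return cov


if __name__ == "__main__":
    cov = residual_covariance([1.0, 2.0, 3.0, 4.0, 5.0])
    for row in cov:
        print(["%7.3f" % value for value in row])
```

The off-diagonal elements are not zero, so a test of independence applied to the residuals as though they were the errors themselves cannot be exact; the trace of the matrix is (n − 2)σ², reflecting the two constants fitted.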

The latent vector case is in fact the only one for which an elegant solution can be obtained. 
It does not seem possible to find exact significance points for any other case. Nevertheless, 
bounds to the significance points can be obtained, and in the second paper such bounds will 
be tabulated. The bounds we shall give are ‘best’ in two senses: first they can be attained 
(with regression vectors of a type that will be discussed later), and secondly, when they are 
attained the test criterion adopted is uniformly most powerful against suitable alternative 
hypotheses. It is hoped that these bounds will settle the question of significance one way or 
the other in many cases arising in practice. For doubtful cases there does not seem to be any 
completely satisfactory procedure. We shall, however, indicate some approximate methods 
which may be useful in certain circumstances. 













The bounds are applicable to all cases in which the independent variables in the regression 
model can be regarded as ‘fixed’. They do not therefore apply to autoregressive schemes and 
similar models in which lagged values of the dependent variable occur as independent 
variables. 

A further slight limitation of the tables in the form in which we shall present them is that 
they apply directly only to regressions in which a constant term or mean has been fitted. 
They cannot therefore be used as they stand for testing the residuals from a regression 
through the origin. In order to carry out the test in such a case it will be necessary to calculate 
a regression which includes a fitted mean. Once the test has been carried out the mean can 
be eliminated by the usual methods for eliminating an independent variable from a regression 
equation (e.g. Fisher, 1946). 


Introduction to theoretical treatment 
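Any single-equation regression model can be written in the form
\[ y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon \]
in which $y$, the dependent variable, and $x$, the independent variable, are observed, the errors
$\epsilon$ being unobserved. We usually require to estimate $\beta_1, \beta_2, \dots, \beta_k$, and to make confidence
statements about the estimates given only the sample
\[ \begin{matrix}
y_1 & x_{11} & x_{21} & \dots & x_{k1} \\
y_2 & x_{12} & x_{22} & \dots & x_{k2} \\
\vdots & \vdots & \vdots & & \vdots \\
y_n & x_{1n} & x_{2n} & \dots & x_{kn}
\end{matrix} \]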

Estimates can be made by assuming the errors $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ associated with the sample to
be random variables distributed with zero expectations independently of the $x$'s. If the
estimates we make are maximum likelihood estimates, and if our confidence statements are
based on likelihood ratios, we can regard the $x$'s as fixed in repeated sampling, that is, they
can be treated as known constants even if they are in fact random variables. If in addition
$\epsilon_1, \epsilon_2, \dots, \epsilon_n$ can be taken to be distributed independently of each other with constant
variance, then by Markoff's theorem the least squares estimates of $\beta_1, \beta_2, \dots, \beta_k$ are best
linear unbiased estimates whatever the form of distribution of the $\epsilon$'s. Unbiased estimates
of the variances of the estimates can also be obtained without difficulty. These estimates
of variance can then be used to make confidence statements by assuming the errors to be
normally distributed.

Thus the assumptions on which the validity of the least squares method is based are as 
follows: 

(a) The error is distributed independently of the independent variables with zero mean 
and constant variance; 

(b) Successive errors are distributed independently of one another.

In what follows autoregressive schemes and stochastic difference equations will be
excluded from further consideration, since assumption (a) does not hold in such cases.
We shall be concerned only with assumption (b), that is, we shall assume that the $x$'s can be
regarded as 'fixed variables'. When (b) is violated the least squares procedure breaks down
at three points:


(i) The estimates of the regression coefficients, though unbiased, need not have least 
variance. 

























J. DURBIN AND G. S. Watson 411 


(ii) The usual formula for the variance of an estimate is no longer applicable and is liable 
to give a serious underestimate of the true variance. 

(iii) The $t$ and $F$ distributions, used for making confidence statements, lose their validity.

In stating these consequences of the violation of assumption (b) we do not overlook the 
fact, pointed out by Wold (1949), that the variances of the resulting estimates depend as 
much on the serial correlations of the independent variables as on the serial correlation of the 
errors. In fact, as Wold showed, when all the sample serial correlations of the x’s are zero 
the estimates of variance given by the least squares method are strictly unbiased whether 
the errors are serially correlated or not. It seems to us doubtful, however, whether this 
result finds much application in practice. It will only rarely be the case that the independent 
variables are serially uncorrelated while the errors are serially correlated. Consequently, we 
feel that there can be little doubt of the desirability of testing the errors for independence 
whenever the least squares method is applied to serially correlated observations. 

To find a suitable test criterion we refer to some results obtained by T. W. Anderson (1948).
Anderson showed that in certain cases in which the regression vectors are latent vectors of
matrices $\Psi$ and $\Theta$ occurring in the error distribution, the statistic $\dfrac{z'\Theta z}{z'\Psi z}$, where $z$ is the
column vector of residuals from regression, provides a test that is uniformly most powerful
against certain alternative hypotheses. The error distributions implied by these alternative
hypotheses are given by Anderson and are such that in the cases that are likely to be useful
in practice $\Psi = I$, the unit matrix. These results suggest that we should examine the
distribution of the statistic $r = \dfrac{z'Az}{z'z}$ (changing the notation slightly) for regression on any set
of fixed variables, $A$ being any real symmetric matrix.

In the next section we shall consider certain formal properties of r defined in this way, and 
in §3 its distribution in the null case will be examined. Expressions for its moments will be 
derived, and it will be shown that its distribution function lies between two distribution 
functions which could be determined. In §4 we return to discuss the question of the choice 
of an appropriate test criterion with rather more rigour and a specific choice is made. In the 
final section certain special properties of this test criterion are given. 


2. TRANSFORMATION OF r 


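We consider the linear regression of $y$ on $k$ independent variables $x_1, x_2, \dots, x_k$. The model
for a sample of $n$ observations is
\[ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
= \begin{pmatrix} x_{11} & x_{21} & \dots & x_{k1} \\ x_{12} & x_{22} & \dots & x_{k2} \\ \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & \dots & x_{kn} \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}
+ \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}, \]
or in an evident matrix notation, $y = X\beta + \epsilon$.

The least squares estimate of $\beta$ is $b = \{b_1, b_2, \dots, b_k\}$ given by $b = (X'X)^{-1}X'y$.
The vector $z = \{z_1, z_2, \dots, z_n\}$ of residuals from regression is defined by
\[ z = y - Xb = \{I_n - X(X'X)^{-1}X'\}\,y, \]
where $I_n$ is the unit matrix of order $n$.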














Thus
\[ z = \{I_n - X(X'X)^{-1}X'\}(X\beta + \epsilon) = \{I_n - X(X'X)^{-1}X'\}\,\epsilon = M\epsilon \text{ say}. \]
It may be verified that $M = M' = M^2$; that is, $M$ is idempotent.*
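The properties of $M$ stated above can be checked numerically. The following is a minimal sketch on simulated data; the helper name `residual_maker`, the dimensions and the random inputs are illustrative assumptions, not part of the paper.

```python
import numpy as np

def residual_maker(X):
    """Return M = I_n - X (X'X)^{-1} X' for an n x k matrix X of full column rank."""
    n = X.shape[0]
    # Solve (X'X) W = X' instead of forming the inverse explicitly.
    return np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)            # illustrative data only
n, k = 8, 3
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
eps = rng.normal(size=n)
y = X @ beta + eps

M = residual_maker(X)
b = np.linalg.solve(X.T @ X, X.T @ y)     # least squares estimate
z = y - X @ b                             # residuals from regression

symmetric = np.allclose(M, M.T)
idempotent = np.allclose(M @ M, M)
residuals_from_errors = np.allclose(z, M @ eps)   # z = M eps
rank_M = int(round(np.trace(M)))          # trace of an idempotent matrix is its rank
```

Since the trace of an idempotent matrix equals its rank, `rank_M` should come out as $n - k$.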
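We now examine the ratio of quadratic forms $r = \dfrac{z'Az}{z'z}$, where $A$ is a real symmetric
matrix. Transforming to the errors we have
\[ r = \frac{\epsilon' M' A M \epsilon}{\epsilon' M' M \epsilon} = \frac{\epsilon' M A M \epsilon}{\epsilon' M \epsilon}. \]

We shall show that there exists an orthogonal transformation which simultaneously reduces
the numerator and denominator of $r$ to their canonical forms; that is, there is an orthogonal
transformation $\epsilon = H\zeta$ such that
\[ r = \frac{\sum_{i=1}^{n-k} \nu_i \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2}. \]
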
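It is well known that there is an orthogonal matrix $L$ such that
\[ L'ML = \begin{pmatrix} I_{n-k} & O \\ O & O \end{pmatrix}, \]
where $I_{n-k}$ is the unit matrix of order $n-k$, and $O$ stands for a zero matrix with appropriate
numbers of rows and columns. This corresponds to the result that $\sum_{i=1}^{n} z_i^2$ is distributed as
$\chi^2$ with $n-k$ degrees of freedom. Thus
\[ L'MAML = L'ML \cdot L'AL \cdot L'ML
= \begin{pmatrix} I_{n-k} & O \\ O & O \end{pmatrix}
\begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix}
\begin{pmatrix} I_{n-k} & O \\ O & O \end{pmatrix}
= \begin{pmatrix} B_1 & O \\ O & O \end{pmatrix}, \]
where $\begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix}$ is the appropriate partition of the real symmetric matrix $L'AL$.
Let $N_1$ be the orthogonal matrix diagonalizing $B_1$, i.e.
\[ N_1' B_1 N_1 = \begin{pmatrix} \nu_1 & & & \\ & \nu_2 & & \\ & & \ddots & \\ & & & \nu_{n-k} \end{pmatrix}, \]
the blank spaces representing zeros. Then $N = \begin{pmatrix} N_1 & O \\ O & I_k \end{pmatrix}$ is orthogonal, so that $H = LN$ is
orthogonal.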


* This matrix treatment of the residuals is due to Aitken (1935). 













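Consequently
\[ H'MH = N'L'MLN = N' \begin{pmatrix} I_{n-k} & O \\ O & O \end{pmatrix} N = \begin{pmatrix} I_{n-k} & O \\ O & O \end{pmatrix}, \]
so that
\[ H'MAMH = H'MH \cdot H'AH \cdot H'MH
= \begin{pmatrix} \nu_1 & & & \\ & \ddots & & \\ & & \nu_{n-k} & \\ & & & O \end{pmatrix}. \]
Putting $\epsilon = H\zeta$, we have
\[ r = \frac{\sum_{i=1}^{n-k} \nu_i \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2}. \]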


This result can be seen geometrically by observing that $\epsilon'MAM\epsilon = \text{constant}$ and
$\epsilon'M\epsilon = \text{constant}$ are hypercylinders with parallel generators, the cross-section of $\epsilon'M\epsilon =$
constant being an $[n-k]$ hypersphere.
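The simultaneous reduction can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' construction: it takes the last $n-k$ columns of a complete QR factor of $X$ as a basis for the residual space (playing the part of the first $n-k$ columns of $L$), diagonalizes the corresponding block of $L'AL$, and compares $r$ computed from the residuals with $r$ computed from the canonical form. The data are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)            # illustrative data only
n, k = 7, 2
X = rng.normal(size=(n, k))
S = rng.normal(size=(n, n))
A = (S + S.T) / 2                         # an arbitrary real symmetric matrix
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

# Columns k..n-1 of the complete QR factor span the space orthogonal to X.
Q, _ = np.linalg.qr(X, mode="complete")
L1 = Q[:, k:]
B1 = L1.T @ A @ L1                        # leading block of L'AL
nu, N1 = np.linalg.eigh(B1)               # N1' B1 N1 = diag(nu)
H1 = L1 @ N1                              # first n-k columns of H = LN

eps = rng.normal(size=n)
z = M @ eps
r_direct = (z @ A @ z) / (z @ z)
zeta = H1.T @ eps                         # zeta_1, ..., zeta_{n-k}
r_canonical = np.sum(nu * zeta**2) / np.sum(zeta**2)

# The nu's together with k zeros should be the latent roots of MAM.
roots_MAM = np.linalg.eigvalsh(M @ A @ M)
roots_expected = np.sort(np.concatenate([nu, np.zeros(k)]))
```

The two values of $r$ should agree to rounding error, since $z = L_1 L_1'\epsilon$ in this construction.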


Determination of $\nu_1, \nu_2, \dots, \nu_{n-k}$

By standard matrix theory $\nu_1, \nu_2, \dots, \nu_{n-k}$ are the latent roots of $MAM$ other than $k$ zeros;
that is, they are the latent roots of $M^2A$, since the roots of the product of two matrices are
independent of the order of multiplication.* But $M^2A = MA$ since $M^2 = M$. Consequently
$\nu_1, \nu_2, \dots, \nu_{n-k}$ are the latent roots of $MA$ other than $k$ zeros.

Suppose now that we make the real non-singular transformation of the $x$'s, $P = XG$.
Then $M = I_n - P(P'P)^{-1}P'$; that is, $M$ is invariant under such transformations. We choose
$G$ so that the column vectors $p_1, p_2, \dots, p_k$ of $P$ are orthogonal and are each of unit length, i.e.
\[ p_i' p_j = \begin{cases} 1 & (i = j), \\ 0 & (i \neq j), \end{cases} \]
so that $P'P = I_k$.


This amounts to saying that we can replace the original independent variables by a 
normalized orthogonal set without affecting the residuals. 


We have, therefore,
\[ M = I_n - (p_1 p_1' + p_2 p_2' + \dots + p_k p_k')
= (I_n - p_1 p_1')(I_n - p_2 p_2') \dots (I_n - p_k p_k')
= M_1 M_2 \dots M_k, \text{ say}. \]
Each factor $M_i$ has the same form as $M$, the matrix $P$ being replaced by the vector $p_i$; it is
idempotent of rank $n-1$ as can be easily verified. From the derivation it is evident that the
$M_i$'s commute. This is an expression in algebraic terms of the fact that we can fit regressions
on orthogonal variables separately and in any order without affecting the final result.
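As an illustration of the factorization, the sketch below orthonormalizes the columns of a random $X$ by a reduced QR decomposition (one convenient choice of $G$, assumed here for illustration) and checks that the product of the rank $n-1$ factors reproduces $M$ in either order.

```python
import numpy as np

rng = np.random.default_rng(2)            # illustrative data only
n, k = 6, 3
X = rng.normal(size=(n, k))

# Reduced QR: X = P R, so P = X G with G = R^{-1}, and P'P = I_k.
P, R = np.linalg.qr(X)

I = np.eye(n)
M_from_X = I - X @ np.linalg.solve(X.T @ X, X.T)
factors = [I - np.outer(p, p) for p in P.T]       # M_i = I - p_i p_i'
M_forward = factors[0] @ factors[1] @ factors[2]
M_reverse = factors[2] @ factors[1] @ factors[0]
ranks = [int(round(np.trace(F))) for F in factors]  # each should be n - 1
```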


* See, for instance, C. C. Macduffee, The Theory of Matrices (Chelsea Publishing Company, 1946), 
Theorem 16-2, 













Returning to the main argument we have the result that $\nu_1, \nu_2, \dots, \nu_{n-k}$ are the roots of
$M_k \dots M_2 M_1 A$ other than $k$ zeros. From the form of the products we see that any result
we establish about the roots of $M_1 A$ in terms of those of $A$ will be true of the roots of $M_2 M_1 A$
in terms of those of $M_1 A$. This observation suggests a method of building up a knowledge of
the roots of $M_k \dots M_2 M_1 A$ in stages starting from the roots of $A$ which we assume known.

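We therefore investigate the latent roots of $M_1 A$, say $\theta_1, \theta_2, \dots, \theta_{n-1}, 0$. These are the roots
of the determinantal equation
\[ |I_n\theta - M_1 A| = 0, \]
i.e.
\[ |I_n\theta - (I_n - p_1 p_1')A| = 0. \tag{1} \]
Let $T$ be the orthogonal matrix diagonalizing $A$, i.e.
\[ T'AT = \Lambda = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}, \]
where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the latent roots of $A$. Pre- and post-multiplying (1) by $T'$ and $T$, we
have
\[ |I_n\theta - (I_n - l_1 l_1')\Lambda| = 0, \]
where $l_1 = \{l_{11}, l_{12}, \dots, l_{1n}\}$ is the vector of direction cosines of $p_1$ referred to the latent vectors
of $A$ as axes. (Complications arising from multiplicities in the roots of $A$ are easily overcome
in the present context.) Dropping the suffix from $l_1$ for the moment, we have
\[ |I_n\theta - (I_n - ll')\Lambda| = 0. \]
Writing out the determinant in full,
\[ \begin{vmatrix}
\theta - \lambda_1 + l_1^2\lambda_1 & l_1 l_2 \lambda_2 & \dots & l_1 l_n \lambda_n \\
l_2 l_1 \lambda_1 & \theta - \lambda_2 + l_2^2\lambda_2 & \dots & l_2 l_n \lambda_n \\
\vdots & \vdots & & \vdots \\
l_n l_1 \lambda_1 & l_n l_2 \lambda_2 & \dots & \theta - \lambda_n + l_n^2\lambda_n
\end{vmatrix} = 0. \]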
Subtracting $l_2/l_1$ times the first row from the second row, $l_3/l_1$ times the first from the third,
and so on, we can expand the determinant to give the equation
\[ \prod_{j=1}^{n} (\theta - \lambda_j) + \sum_{i=1}^{n} l_i^2 \lambda_i \prod_{j \neq i}^{n} (\theta - \lambda_j) = 0. \]
Reducing and taking out a factor $\theta$ corresponding to the known zero root of $M_1 A$ gives
\[ \sum_{i=1}^{n} l_i^2 \prod_{j \neq i}^{n} (\theta - \lambda_j) = 0. \tag{2} \]
$\theta_1, \theta_2, \dots, \theta_{n-1}$ are the roots of this equation.
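Equation (2) can be checked numerically by building the polynomial $\sum_i l_i^2 \prod_{j \neq i} (\theta - \lambda_j)$ from the latent roots of a random symmetric $A$ and the direction cosines of a random unit vector $p_1$, then comparing its roots with the non-zero latent roots of $M_1 A$. This is a sketch on illustrative random inputs; the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)            # illustrative data only
n = 6
S = rng.normal(size=(n, n))
A = (S + S.T) / 2
lam, T = np.linalg.eigh(A)                # T'AT = diag(lam), lam ascending

p = rng.normal(size=n)
p /= np.linalg.norm(p)                    # unit regression vector p_1
l = T.T @ p                               # direction cosines relative to latent vectors

# Coefficients (highest power first) of sum_i l_i^2 prod_{j != i} (theta - lam_j).
coeffs = np.zeros(n)
for i in range(n):
    coeffs = coeffs + l[i] ** 2 * np.poly(np.delete(lam, i))
theta_from_eq2 = np.sort(np.roots(coeffs).real)

# Latent roots of M_1 A with the single (numerically) zero root removed.
M1 = np.eye(n) - np.outer(p, p)
roots = np.linalg.eigvals(M1 @ A).real
theta_from_matrix = np.sort(np.delete(roots, np.argmin(np.abs(roots))))
```

Because $\sum_i l_i^2 = 1$, the polynomial is monic of degree $n-1$, so it has exactly the $n-1$ roots $\theta_1, \dots, \theta_{n-1}$.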

We notice that when $l_r = 0$, $\theta - \lambda_r$ is a factor of (2) so that $\theta = \lambda_r$ is a solution. Thus when $p_1$
coincides with a latent vector of $A$, $\theta_1, \theta_2, \dots, \theta_{n-1}$ are equal to the latent roots associated with
the remaining $n-1$ latent vectors of $A$. In the same way if $p_2$ also coincides with a latent vector
of $A$, the roots of $M_2 M_1 A$ other than two zeros are equal to the latent roots associated with
the remaining $n-2$ latent vectors of $A$. Thus, in general, if the $k$ regression vectors coincide
with $k$ of the latent vectors of $A$, $\nu_1, \nu_2, \dots, \nu_{n-k}$ are equal to the roots associated with the
remaining $n-k$ latent vectors of $A$. This result remains true if the regression vectors are
(linearly independent) linear combinations of $k$ of the latent vectors of $A$.













For other cases it would be possible to write down an equation similar to (2) giving the
roots of $M_2 M_1 A$ in terms of $\theta_1, \theta_2, \dots, \theta_{n-1}$, and so on. In this way it would be theoretically
possible to determine $\nu_1, \nu_2, \dots, \nu_{n-k}$. The resulting equations would, however, be quite
unmanageable except in the latent vector case just mentioned.


Inequalities on $\nu_1, \nu_2, \dots, \nu_{n-k}$
We therefore seek inequalities on $\nu_1, \nu_2, \dots, \nu_{n-k}$. For the sake of generality we suppose
that certain of the regression vectors, say $n-k-s$ of them, coincide with latent vectors of
$A$ (or are linear combinations of them). We are left with $s$ of the $\nu$'s for which we require
inequalities in terms of the remaining $s+k$ $\lambda$'s. We renumber them so that
\[ \nu_1 \leq \nu_2 \leq \dots \leq \nu_s, \]
\[ \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_{s+k}. \]
We proceed to show that
\[ \lambda_i \leq \nu_i \leq \lambda_{i+k} \qquad (i = 1, 2, \dots, s). \tag{3} \]

It is convenient to establish first an analogous result for the full sets of $\nu$'s and $\lambda$'s.

We therefore arrange the suffixes so that
\[ \nu_1 \leq \nu_2 \leq \dots \leq \nu_{n-k}, \]
\[ \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n. \]
We also arrange the $\theta$'s so that $\theta_1 \leq \theta_2 \leq \dots \leq \theta_{n-1}$.


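It was noted above that if $l_r = 0$, $\lambda_r$ is a root of (2). Also if any two of the $\lambda$'s, say $\lambda_r$ and
$\lambda_{r+1}$, are equal, then $\lambda_r = \lambda_{r+1}$ is a root of (2). These are the only two cases in which any of the
$\lambda$'s is a root of (2).

For the remaining roots let
\[ f(\theta) = \sum_{i=1}^{n} l_i^2 \prod_{j \neq i}^{n} (\theta - \lambda_j). \]
Then
\[ f(\lambda_r) = l_r^2 \prod_{j \neq r}^{n} (\lambda_r - \lambda_j), \]
so that if $f(\lambda_r) \geq 0$ then $f(\lambda_{r+1}) \leq 0$, and if $f(\lambda_r) \leq 0$ then $f(\lambda_{r+1}) \geq 0$. Since $f(\theta)$ is continuous
there must therefore be a root in every interval $\lambda_r \leq \theta \leq \lambda_{r+1}$. Thus
\[ \lambda_i \leq \theta_i \leq \lambda_{i+1} \qquad (i = 1, 2, \dots, n-1). \tag{4} \]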

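To extend this result we recall that $M_1 A$ has one zero root in addition to $\theta_1, \theta_2, \dots, \theta_{n-1}$.

Suppose $\theta_l \leq 0 \leq \theta_{l+1}$; then the roots of $M_1 A$ can be arranged in the order
\[ \theta_1 \leq \theta_2 \leq \dots \leq \theta_l \leq 0 \leq \theta_{l+1} \leq \dots \leq \theta_{n-1}. \]
Let the roots of $M_2(M_1 A)$ be $\phi_1, \phi_2, \dots, \phi_{n-1}$ together with one zero root. Then by (4)
\[ \theta_1 \leq \phi_1 \leq \theta_2 \leq \phi_2 \leq \dots \leq \theta_l \leq \phi_l \leq 0 \leq \phi_{l+1} \leq \dots. \]

But $M_2 M_1 A$ certainly has two zero roots, since $M_2 M_1 A$ has rank at most $n-2$. Thus either
$\phi_l$ or $\phi_{l+1}$ must be zero. Rejecting one of them and renumbering we have
\[ \lambda_i \leq \phi_i \leq \lambda_{i+2} \qquad (i = 1, 2, \dots, n-2). \]
Applying the same argument successively we have
\[ \lambda_i \leq \nu_i \leq \lambda_{i+k} \qquad (i = 1, 2, \dots, n-k). \]
Deleting cases of equality due to regression vectors coinciding with latent vectors of $A$ we
have (3).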
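The inequalities $\lambda_i \leq \nu_i \leq \lambda_{i+k}$ for the full sets of roots can be spot-checked on random inputs. The sketch below is illustrative only: the helper names are assumptions, and the $\nu$'s are obtained as the latent roots of $A$ restricted to the space orthogonal to the columns of $X$.

```python
import numpy as np

def nonzero_roots(X, A):
    """Latent roots of MA other than k zeros, in ascending order."""
    n, k = X.shape
    Q, _ = np.linalg.qr(X, mode="complete")
    L1 = Q[:, k:]                          # orthonormal basis orthogonal to X
    return np.linalg.eigvalsh(L1.T @ A @ L1)

def satisfies_inequalities(X, A, tol=1e-10):
    """Check lambda_i <= nu_i <= lambda_{i+k} for i = 1, ..., n-k."""
    n, k = X.shape
    lam = np.linalg.eigvalsh(A)            # ascending
    nu = nonzero_roots(X, A)
    return bool(np.all(lam[: n - k] <= nu + tol) and np.all(nu <= lam[k:] + tol))

rng = np.random.default_rng(4)             # illustrative data only
results = []
for _ in range(20):
    n, k = 10, 3
    X = rng.normal(size=(n, k))
    S = rng.normal(size=(n, n))
    results.append(satisfies_inequalities(X, (S + S.T) / 2))
```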
The results of this section will be gathered into a lemma.











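Lemma. If $z$ and $\epsilon$ are $n \times 1$ vectors such that $z = M\epsilon$, where $M = I_n - X(X'X)^{-1}X'$, and
if $r = \dfrac{z'Az}{z'z}$, where $A$ is a real symmetric matrix, then

(a) There is an orthogonal transformation $\epsilon = H\zeta$, such that
\[ r = \frac{\sum_{i=1}^{n-k} \nu_i \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2}, \]
where $\nu_1, \nu_2, \dots, \nu_{n-k}$ are the latent roots of $MA$ other than $k$ zeros;

(b) If $n-k-s$ of the columns of $X$ are linear combinations of $n-k-s$ of the latent vectors
of $A$, then $n-k-s$ of the $\nu$'s are equal to the latent roots corresponding to these latent
vectors; renumbering the remaining roots such that
\[ \nu_1 \leq \nu_2 \leq \dots \leq \nu_s, \]
\[ \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_{s+k}, \]
then
\[ \lambda_i \leq \nu_i \leq \lambda_{i+k} \qquad (i = 1, 2, \dots, s). \]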
We deduce the following corollary:

COROLLARY
\[ r_L \leq r \leq r_U, \]
where
\[ r_L = \frac{\sum_{i=1}^{s} \lambda_i \zeta_i^2 + \sum_{i=s+1}^{n-k} \lambda_{i+k} \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2} \]
and
\[ r_U = \frac{\sum_{i=1}^{s} \lambda_{i+k} \zeta_i^2 + \sum_{i=s+1}^{n-k} \lambda_{i+k} \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2}. \]

This follows immediately by appropriate numbering of suffixes, taking $\lambda_{s+k+1}, \dots, \lambda_n$ as the
latent roots corresponding to the latent regression vectors and arranging the remainder so
that $\lambda_i \leq \lambda_{i+1}$. The importance of this result is that it sets bounds upon $r$ which do not depend
upon the particular set of regression vectors. $r_L$ and $r_U$ are the best such bounds in that they
can be attained, this being the case when the regression vectors coincide with certain of the
latent vectors of $A$.
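A small numerical sketch of the corollary follows, for the case in which none of the regression vectors is a latent vector of $A$ (so that $s = n-k$, an assumption made here for simplicity). With the same $\zeta$'s, $r_L$ weights them by $\lambda_1, \dots, \lambda_{n-k}$ and $r_U$ by $\lambda_{k+1}, \dots, \lambda_n$. The function name and the random inputs are illustrative.

```python
import numpy as np

def bounds_and_ratio(X, A, eps):
    """Return (r_L, r, r_U) for s = n - k, using the canonical variables zeta."""
    n, k = X.shape
    Q, _ = np.linalg.qr(X, mode="complete")
    L1 = Q[:, k:]                              # basis of the residual space
    nu, N1 = np.linalg.eigh(L1.T @ A @ L1)     # nu ascending
    zeta = (L1 @ N1).T @ eps
    w = zeta**2 / np.sum(zeta**2)              # weights zeta_i^2 / sum zeta_j^2
    lam = np.linalg.eigvalsh(A)                # lambda ascending
    r = float(nu @ w)
    r_lower = float(lam[: n - k] @ w)          # lambda_i in place of nu_i
    r_upper = float(lam[k:] @ w)               # lambda_{i+k} in place of nu_i
    return r_lower, r, r_upper

rng = np.random.default_rng(5)                 # illustrative data only
n, k = 12, 3
checks = []
for _ in range(50):
    X = rng.normal(size=(n, k))
    S = rng.normal(size=(n, n))
    lo, r, hi = bounds_and_ratio(X, (S + S.T) / 2, rng.normal(size=n))
    checks.append(lo <= r + 1e-12 and r <= hi + 1e-12)
```

The bounds depend on $X$ only through the weights, which is why significance points for $r_L$ and $r_U$ can be tabulated once for a given $A$.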


3. DISTRIBUTION OF r 


It has been pointed out that when the errors are distributed independently of the independent 
variables the latter can be regarded as fixed. There is one special case, however, in which it is 
more convenient to regard the x’s as varying. We shall discuss this first before going on to 
consider the more general problem of regression on ‘fixed variables’. 

The case we shall consider is that of a multivariate normal system. In such a system the 
regressions are linear and the errors are distributed independently of the independent 
variables. It will be shown that if $y, x_1, \dots, x_k$ are distributed jointly normally such that the
regression of $y$ on the $x$'s passes through the origin, and if successive observations are
independent, then $r$ is distributed as if the residuals $z_1, \dots, z_n$ were independent normal
variables. That is, the regression effect disappears from the problem. Similarly, when the 













regression does not pass through the origin $r$ is distributed as if the $z$'s were residuals from the
sample mean of $n$ normal independent observations.

This is perhaps not a very important case in practice, since it will rarely happen that we 
shall wish to test the hypothesis of serial correlation in the errors when it is known that 
successive observations of the x’s are independent. Nevertheless, it is convenient to deal 
with it first before going on to discuss the more important case of regression on ‘fixed 
variables’. 

To establish the result we consider the geometrical representation of the sample and
observe that the sample value of $r$ depends only on the direction in space of the residual
vector $z$. If the $x$'s are kept fixed and the errors are normal and independent, $z$ is randomly
directed in the $[n-k]$ space orthogonal to the space spanned by the $x$ vectors. If the $x$'s are
allowed to vary $z$ will be randomly directed in the $[n]$ space if and only if the $x$'s are jointly
normal and successive observations are independent (Bartlett, 1934). In this case the
direction of $z$ is distributed as if $z_1, z_2, \dots, z_n$ were normal and independent with the same
variance. Thus when $y, x_1, \dots, x_k$ are multivariate normal such that the regression of $y$ on $x$
passes through the origin, $r$ is distributed as if the residuals from the fitted regression through
the origin were normal and independent variables.

In the same way it can be shown that if we fit a regression including a constant term, r is 
distributed as if the z’s were residuals from a sample mean of normal independent variables 
whether the population regression passes through the origin or not. 


Regression on ‘fixed variables’ 
To examine the distribution of $r$ on the null hypothesis in the 'fixed variable' case we
assume that the errors $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ are independent normal variables with constant variance,
i.e. they are independent $N(0, \sigma^2)$ variables. Transforming as in §2 we have
\[ r = \frac{\sum_{i=1}^{n-k} \nu_i \zeta_i^2}{\sum_{i=1}^{n-k} \zeta_i^2}. \]
Since the transformation is orthogonal, $\zeta_1, \zeta_2, \dots, \zeta_{n-k}$ are independent $N(0, \sigma^2)$ variables.
It is evident that the variation of $r$ is limited to the range $(\nu_1, \nu_{n-k})$.

Assuming the $\nu$'s known, the exact distribution of $r$ has been given by R. L. Anderson
(1942) for two special cases: first for $n-k$ even, the $\nu$'s being equal in pairs, and second for
$n-k$ odd, the $\nu$'s being equal in pairs with one value greater or less than all the others.
Anderson's expressions for the distribution function are as follows:
\[ P(r > r') = \sum_{i=1}^{m} \frac{(\tau_i - r')^{\frac12(n-k)-1}}{\alpha_i} \qquad (\tau_{m+1} \leq r' \leq \tau_m), \]
where

$n-k$ even: the $\nu_i$'s form $\frac12(n-k)$ distinct pairs denoted by $\tau_1 > \tau_2 > \dots > \tau_{\frac12(n-k)}$, and
\[ \alpha_i = \prod_{j \neq i}^{\frac12(n-k)} (\tau_i - \tau_j), \]

$n-k$ odd: the $\nu_i$'s form $\frac12(n-k-1)$ distinct pairs as above together with one isolated
root $\tau$ less than all the others and
\[ \alpha_i = \prod_{j \neq i}^{\frac12(n-k-1)} (\tau_i - \tau_j)\,\sqrt{\tau_i - \tau}. \]
The expression for $n-k$ odd, $\tau > \tau_1$, is obtained by writing $-r$ for $r$.
Formulae for the density function are also given by Anderson.
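For the even case the tail probability is simple to evaluate. The sketch below assumes the formula in the form $P(r > r') = \sum_{i=1}^{m} (\tau_i - r')^{\frac12(n-k)-1} / \alpha_i$ with $\alpha_i = \prod_{j \neq i} (\tau_i - \tau_j)$; the function name and the example roots are hypothetical. When $n-k = 4$ there are two pairs and $r$ is uniformly distributed between $\tau_2$ and $\tau_1$, which gives an independent check.

```python
import math

def anderson_upper_tail(r_prime, tau, n_minus_k):
    """P(r > r') for n-k even, the roots being equal in pairs.

    tau lists the distinct pair values; there must be (n-k)/2 of them.
    """
    tau = sorted(tau, reverse=True)            # tau_1 > tau_2 > ...
    if len(tau) != n_minus_k // 2:
        raise ValueError("need (n-k)/2 distinct pair values")
    if r_prime >= tau[0]:
        return 0.0
    if r_prime <= tau[-1]:
        return 1.0
    m = sum(1 for t in tau if t > r_prime)     # tau_{m+1} <= r' < tau_m
    power = n_minus_k // 2 - 1
    total = 0.0
    for i in range(m):
        alpha = math.prod(tau[i] - tau[j] for j in range(len(tau)) if j != i)
        total += (tau[i] - r_prime) ** power / alpha
    return total

# Two pairs (n-k = 4): r is uniform on (1, 3), so P(r > 2) is one half.
p_uniform = anderson_upper_tail(2.0, [3.0, 1.0], 4)
# Three pairs (n-k = 6) with hypothetical roots 3, 2, 0.
p_three_pairs = anderson_upper_tail(1.0, [3.0, 2.0, 0.0], 6)
```

In the three-pair example the weights $\zeta_i^2$ summed in pairs and normalized are uniformly distributed over a triangle, and a direct area calculation gives $5/6$ for the second value.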


For the case in which the $\nu$'s are all different and $n-k$ is even the $[\frac12(n-k)-1]$th derivative
of the density function has been given by von Neumann (1941), but up to the present no
elementary expression for the density function itself has been put forward. Von Neumann's
expression for the derivative is as follows:
\[ \frac{d^{\frac12(n-k)-1}}{dr^{\frac12(n-k)-1}} f(r) =
\begin{cases}
0, & m \text{ even}, \\[6pt]
\dfrac{(-1)^{\frac12(n-k-m-1)} \left(\dfrac{n-k-2}{2}\right)!}{\pi \sqrt{-\prod_{i=1}^{n-k} (r - \nu_i)}}, & m \text{ odd},
\end{cases} \]
for $\nu_{m+1} < r < \nu_m$, $m = 1, 2, \dots, n-k-1$.

To use these results in any particular case the $\nu$'s would need to be known quantities,
which means in practice that the regression vectors must be latent vectors of $A$. In addition,
the roots associated with the remaining $n-k$ latent vectors of $A$ must satisfy R. L. Anderson's
or von Neumann's conditions.

The results can also be applied to the distributions of $r_L$ and $r_U$, the lower and upper
bounds of $r$, provided the appropriate $\lambda$'s satisfy the conditions. Using the relations
\[ F_L(r) \geq F(r) \geq F_U(r), \tag{5} \]
where $F_L$ and $F_U$ are the distribution functions of $r_L$ and $r_U$ we would then have limits to the
distribution function of $r$. The truth of the relations (5) can be seen by noting that $r_L$ and $r$ are
in (1, 1) correspondence and $r_L \leq r$ always.





Approximations 

R. L. Anderson’s distribution becomes unwieldy to work with when n—k is moderately 
large, and von Neumann’s results can only be used to give an exact distribution when n—k 
is very small. For practical applications, therefore, approximate methods are required. 

We first mention the result, pointed out by T. W. Anderson (1948), that as $n-k$ becomes
large $r$ is asymptotically normally distributed with the mean and variance given later in this
paper. For moderate values of $n-k$, however, it appears that the distributions of certain
statistics of the type $r$ are better approximated by a $\beta$-distribution, even when symmetric.*
One would expect the advantage of the $\beta$ over the normal approximation to be even greater
when the $\nu$'s are such that the distribution of $r$ is skew. For better approximations various
expansions in terms of $\beta$-functions can be used. One such expansion was used for most of
the tabulation of the distribution of von Neumann's statistic (Hart 1942). Another method
is to use a series expansion in terms of Jacobi polynomials using a $\beta$-distribution expression
as weight function. (See, for instance, Courant & Hilbert,† 1931, p. 76.) The first four terms
of such a series will be used for calculating some of the bounds to the significance points of
$r$ tabulated in our second paper.

Moments of r 


To use the above approximations we require the moments of $r$. First we note that since
$r$ is independent of the scale of the $\zeta$'s we can take $\sigma^2$ equal to unity. We therefore require the
moments of $r = u/v$, where $u = \sum_{i=1}^{n-k} \nu_i \zeta_i^2$ and $v = \sum_{i=1}^{n-k} \zeta_i^2$, $\zeta_1, \zeta_2, \dots, \zeta_{n-k}$ being independent
$N(0, 1)$ variables.


* See, for instance, Rubin (1945), Dixon (1944), R. L. Anderson and T. W. Anderson (1950).
† Note, however, the misprint: $x^{q}(1-x)^{p-q}$ should read $x^{q-1}(1-x)^{p-q}$.










It is well known (Pitman, 1937; von Neumann, 1941) that $r$ and $v$ are distributed
independently. Consequently
\[ E(u^s) = E(r^s v^s) = E(r^s)\,E(v^s), \]
so that
\[ E(r^s) = \frac{E(u^s)}{E(v^s)}; \]
that is, the moments of the ratio are the ratios of the moments.

The moments of $u$ are most simply obtained by noting that $u$ is the sum of independent
variables $\nu_i \zeta_i^2$, where $\zeta_i^2$ is a $\chi^2$ variable with one degree of freedom. Hence the $s$th cumulant
of $u$ is the sum of $s$th cumulants, that is
\[ \kappa_s(u) = 2^{s-1}(s-1)! \sum_{i=1}^{n-k} \nu_i^s, \]
since $\kappa_s(\nu_i \zeta_i^2) = 2^{s-1}(s-1)!\,\nu_i^s$. In particular
\[ \kappa_1(u) = \sum \nu_i, \qquad \kappa_2(u) = 2\sum \nu_i^2. \]

The moments of $u$ can then be obtained from the cumulants.
The moments of $v$ are simply those of $\chi^2$ with $n-k$ degrees of freedom, i.e.
\[ E(v) = n-k, \qquad E(v^2) = (n-k)(n-k+2), \text{ etc.} \]
Hence
\[ E(r) = \mu_1' = \frac{1}{n-k} \sum_{i=1}^{n-k} \nu_i = \bar{\nu} \text{ say}. \tag{6} \]


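To obtain the moments of $r$ about the mean we have
\[ r - \bar{\nu} = \frac{\sum (\nu_i - \bar{\nu})\,\zeta_i^2}{\sum \zeta_i^2} = \frac{u'}{v} \text{ say}. \]
As before the moments of $r - \bar{\nu}$ are the moments of $u'$ divided by the moments of $v$. The
moments of $u'$ are obtained from the cumulants
\[ \kappa_s(u') = 2^{s-1}(s-1)! \sum_{i=1}^{n-k} (\nu_i - \bar{\nu})^s. \]

In this way we find
\[ \operatorname{var} r = \mu_2 = \frac{2\sum (\nu_i - \bar{\nu})^2}{(n-k)(n-k+2)}, \tag{7} \]
\[ \mu_3 = \frac{8\sum (\nu_i - \bar{\nu})^3}{(n-k)(n-k+2)(n-k+4)}, \]
\[ \mu_4 = \frac{48\sum (\nu_i - \bar{\nu})^4 + 12\{\sum (\nu_i - \bar{\nu})^2\}^2}{(n-k)(n-k+2)(n-k+4)(n-k+6)}. \]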
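The moment formulae translate directly into code. The following sketch (the function name is an assumption) takes the $\nu$'s as given and returns the mean and the second, third and fourth moments about the mean. For $n-k = 2$ the ratio can be written as $\bar{\nu} + \frac12(\nu_1 - \nu_2)\cos 2\phi$ with $\phi$ uniformly distributed, which provides an exact check.

```python
def r_moments(nu):
    """Mean and central moments mu_2, mu_3, mu_4 of r, given the roots nu_1..nu_{n-k}."""
    m = len(nu)                                # m = n - k
    mean = sum(nu) / m
    d = [v - mean for v in nu]                 # deviations nu_i - nu_bar
    s2 = sum(x**2 for x in d)
    s3 = sum(x**3 for x in d)
    s4 = sum(x**4 for x in d)
    mu2 = 2 * s2 / (m * (m + 2))
    mu3 = 8 * s3 / (m * (m + 2) * (m + 4))
    mu4 = (48 * s4 + 12 * s2**2) / (m * (m + 2) * (m + 4) * (m + 6))
    return mean, mu2, mu3, mu4

# n - k = 2 with hypothetical roots 1 and 3: r = 2 + cos(2*phi), phi uniform,
# so the variance is E[cos^2] = 1/2 and the fourth moment is E[cos^4] = 3/8.
mean, mu2, mu3, mu4 = r_moments([1.0, 3.0])
```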














It must be emphasized at this point that the moments just given refer to regression through 
the origin on k independent variables. If the regression model includes a constant term, that 
is, if the calculated regression includes a fitted mean, and if, as is usual, we wish to distinguish 
the remaining independent variables from the constant term, then k must be taken equal to 
k’ +1 in the above expressions, k’ being the number of independent variables in addition to 
the constant. We emphasize this point, since it is k’ that is usually referred to as the number 
of independent variables in such a model. 

The expressions given will enable the moments of $r$ to be calculated when the $\nu$'s are known.
In most cases that will arise in practice, however, the $\nu$'s will be unknown and it will be




impracticable to calculate them. We therefore require means of expressing the power sums
$\sum \nu_i^q$ in terms of known quantities, namely the matrix $A$ and the independent variables.
To do this we make use of the concept of the trace of a matrix, that is, the sum of its
leading diagonal elements. This is denoted for a matrix $S$ by $\operatorname{tr} S$, $S$ being of course square.
It is easy to show that the operation of taking a trace satisfies the following simple rules:
(a) $\operatorname{tr}(S+T) = \operatorname{tr} S + \operatorname{tr} T$,
(b) $\operatorname{tr} ST = \operatorname{tr} TS$ whether $S$ and $T$ are square or rectangular.
From these rules we deduce a third:
(c) $\operatorname{tr}(S+T)^q = \operatorname{tr} S^q + \binom{q}{1} \operatorname{tr} S^{q-1}T + \binom{q}{2} \operatorname{tr} S^{q-2}T^2 + \dots + \operatorname{tr} T^q$,
when $S$ and $T$ are square. In addition, we note that $\operatorname{tr} S = \sum_{i=1}^{m} \sigma_i$, where $\sigma_1, \sigma_2, \dots, \sigma_m$ are the
latent roots of $S$, and in general that $\operatorname{tr} S^q = \sum_{i=1}^{m} \sigma_i^q$.
Thus we have immediately
\[ \sum_{i=1}^{n-k} \nu_i^q = \operatorname{tr}(MA)^q, \]
since $\nu_1, \nu_2, \dots, \nu_{n-k}$ together with $k$ zeros are the latent roots of $MA$.

In cases in which the independent variables are known constants it is sometimes possible
to construct the matrix $MA$ directly and hence to obtain the mean and variance of $r$ in
a fairly straightforward way.

For models of other types in which the independent variables can take arbitrary values
further reduction is needed. For the mean we require
\[ \begin{aligned}
\sum \nu_i = \operatorname{tr} MA &= \operatorname{tr}\{I_n - X(X'X)^{-1}X'\}A \\
&= \operatorname{tr} A - \operatorname{tr} X(X'X)^{-1}X'A \quad \text{by rule (a)} \\
&= \operatorname{tr} A - \operatorname{tr} X'AX(X'X)^{-1} \quad \text{by rule (b)}.
\end{aligned} \tag{8} \]
The calculation of this expression is not as formidable an undertaking as might at first sight
appear, since $(X'X)^{-1}$ will effectively have to be calculated in any case for the estimation of
the regression coefficients. It is interesting to note incidentally that the matrix $X'AX(X'X)^{-1}$
in the expression is a direct multivariate generalization of the statistic $r$.
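For the variance we require
\[ \begin{aligned}
\sum \nu_i^2 = \operatorname{tr}(MA)^2 &= \operatorname{tr}\{A - X(X'X)^{-1}X'A\}^2 \\
&= \operatorname{tr} A^2 - 2\operatorname{tr} X'A^2X(X'X)^{-1} + \operatorname{tr}\{X'AX(X'X)^{-1}\}^2,
\end{aligned} \tag{9} \]
by rules (b) and (c).
Similarly
\[ \begin{aligned}
\sum \nu_i^3 = \operatorname{tr} A^3 &- 3\operatorname{tr} X'A^3X(X'X)^{-1} \\
&+ 3\operatorname{tr}\{X'A^2X(X'X)^{-1}\,X'AX(X'X)^{-1}\} - \operatorname{tr}\{X'AX(X'X)^{-1}\}^3,
\end{aligned} \tag{10} \]
\[ \begin{aligned}
\sum \nu_i^4 = \operatorname{tr} A^4 &- 4\operatorname{tr} X'A^4X(X'X)^{-1} + 6\operatorname{tr}\{X'A^3X(X'X)^{-1}\,X'AX(X'X)^{-1}\} \\
&- 4\operatorname{tr}[X'A^2X(X'X)^{-1}\{X'AX(X'X)^{-1}\}^2] + \operatorname{tr}\{X'AX(X'X)^{-1}\}^4,
\end{aligned} \tag{11} \]
and so on.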
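The first three power sums can be checked against the latent roots of $MA$ on simulated data. The sketch below is illustrative only: the data are random and the variable names are assumptions. The last term of the third power sum is taken with a minus sign, as the expansion of $\operatorname{tr}\{A - X(X'X)^{-1}X'A\}^3$ requires.

```python
import numpy as np

rng = np.random.default_rng(8)            # illustrative data only
n, k = 9, 3
X = rng.normal(size=(n, k))
S = rng.normal(size=(n, n))
A = (S + S.T) / 2
G = np.linalg.inv(X.T @ X)                # (X'X)^{-1}, needed for the regression anyway

A2 = A @ A
A3 = A2 @ A
C1 = X.T @ A @ X @ G                      # X'AX(X'X)^{-1}
C2 = X.T @ A2 @ X @ G                     # X'A^2X(X'X)^{-1}
C3 = X.T @ A3 @ X @ G                     # X'A^3X(X'X)^{-1}
tr = np.trace

sum_nu = tr(A) - tr(C1)                                         # mean formula
sum_nu2 = tr(A2) - 2 * tr(C2) + tr(C1 @ C1)                     # variance formula
sum_nu3 = tr(A3) - 3 * tr(C3) + 3 * tr(C2 @ C1) - tr(C1 @ C1 @ C1)

# Direct evaluation through M, for comparison.
M = np.eye(n) - X @ G @ X.T
MA = M @ A
direct = (tr(MA), tr(MA @ MA), tr(MA @ MA @ MA))
```

Only $k \times k$ matrices appear on the right-hand sides apart from the traces of powers of $A$, which is the practical point of the reduction.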


When the independent variables are orthogonal these expressions can be simplified somewhat
since $X'X$ is then a diagonal matrix. Thus
\[ \operatorname{tr} X'AX(X'X)^{-1} = \sum_{i=1}^{k} \frac{x_i'Ax_i}{x_i'x_i}, \]






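$x_i$ standing for the vector of sample values of the $i$th independent variable. Each term in the
summation has the form $r$ in terms of one of the independent variables. Similarly for
$\operatorname{tr} X'A^2X(X'X)^{-1}$, $\operatorname{tr} X'A^3X(X'X)^{-1}$, etc. We have also
\[ \operatorname{tr}\{X'AX(X'X)^{-1}\}^2 = \sum_{i=1}^{k} \left(\frac{x_i'Ax_i}{x_i'x_i}\right)^2 + 2\sum_{i<j}^{k} \frac{(x_i'Ax_j)^2}{x_i'x_i \cdot x_j'x_j}. \]
Thus when the regression vectors are orthogonal the following formulae enable us to calculate
the mean and variance of $r$:
\[ \left. \begin{aligned}
\sum \nu_i &= \operatorname{tr} A - \sum_{i=1}^{k} \frac{x_i'Ax_i}{x_i'x_i}, \\
\sum \nu_i^2 &= \operatorname{tr} A^2 - 2\sum_{i=1}^{k} \frac{x_i'A^2x_i}{x_i'x_i} + \sum_{i=1}^{k} \left(\frac{x_i'Ax_i}{x_i'x_i}\right)^2 + 2\sum_{i<j}^{k} \frac{(x_i'Ax_j)^2}{x_i'x_i \cdot x_j'x_j}.
\end{aligned} \right\} \tag{12} \]
The mean and variance are obtained by substituting these values in (6) and (7).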
Similar results apply when $X$ is partitioned into two or more orthogonal sets of variables.
For instance, when $X$ consists of the constant vector $\{c, c, \dots, c\}$ together with the matrix
$\dot{X}$ of deviations from the means of the remaining $k-1$ variables, i.e.
\[ \dot{x}_{ij} = x_{ij} - \bar{x}_i \qquad (i = 2, 3, \dots, k;\ j = 1, 2, \dots, n), \]
then
\[ \operatorname{tr} X'AX(X'X)^{-1} = \frac{i'Ai}{n} + \operatorname{tr} \dot{X}'A\dot{X}(\dot{X}'\dot{X})^{-1}, \]
\[ \operatorname{tr}\{X'AX(X'X)^{-1}\}^2 = \left(\frac{i'Ai}{n}\right)^2 + \frac{2\,i'A\dot{X}(\dot{X}'\dot{X})^{-1}\dot{X}'Ai}{n} + \operatorname{tr}\{\dot{X}'A\dot{X}(\dot{X}'\dot{X})^{-1}\}^2, \]
where $i$ is the equiangular vector $\{1, 1, \dots, 1\}$. When this is a latent vector of $A$ corresponding
to a latent root of zero, $i'A = 0$. We then have the important result that (8)-(11) apply
without change except that the original variables $X$ are replaced by the deviations from their
means $\dot{X}$. This result holds whenever $x'Ax$ is invariant under a change of origin of $x$.

Before closing this treatment of moments we should mention one difficulty in using them
for obtaining approximations in terms of $\beta$-distributions and associated expansions. In
constructing such approximations one usually knows the range within which the variable
is distributed. In the present problem, however, the range is $(\nu_1, \nu_{n-k})$, which will often be
unknown and impracticable to determine. In such cases it will accordingly be necessary to
use approximations to $\nu_1$ and $\nu_{n-k}$ before the distributions can be fitted.


Characteristic function of u and v 


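An alternative method of obtaining the moments of $u$ and $v$ is to use their joint characteristic
function. This is given by
\[ \begin{aligned}
\phi(t_1, t_2) &= \frac{1}{(2\pi)^{\frac12(n-k)}} \int \dots \int \exp\Bigl\{ i t_1 \sum \nu_j \zeta_j^2 + i t_2 \sum \zeta_j^2 - \tfrac12 \sum \zeta_j^2 \Bigr\}\, d\zeta_1 \dots d\zeta_{n-k} \\
&= \prod_{j=1}^{n-k} (1 - 2\nu_j i t_1 - 2 i t_2)^{-\frac12} \\
&= (1 - 2it_2)^{\frac12 k} \prod_{j=1}^{n-k} (1 - 2it_2 - 2\nu_j i t_1)^{-\frac12}\, (1 - 2it_2)^{-\frac12 k} \\
&= (1 - 2it_2)^{\frac12 k}\, \bigl| I_n(1 - 2it_2) - 2it_1 MA \bigr|^{-\frac12},
\end{aligned} \]
since $\nu_1, \dots, \nu_{n-k}$ together with $k$ zeros are the roots of the equation $|I_n\nu - MA| = 0$.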
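The determinantal form can be verified numerically by comparing $\phi^{-2}$ computed in the two ways, which avoids any question of the branch of the square root. This is a sketch on illustrative random inputs, with arbitrary real values for $t_1$ and $t_2$.

```python
import numpy as np

rng = np.random.default_rng(6)            # illustrative data only
n, k = 6, 2
X = rng.normal(size=(n, k))
S = rng.normal(size=(n, n))
A = (S + S.T) / 2
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

# nu_1..nu_{n-k}: latent roots of A restricted to the space orthogonal to X.
Q, _ = np.linalg.qr(X, mode="complete")
L1 = Q[:, k:]
nu = np.linalg.eigvalsh(L1.T @ A @ L1)

t1, t2 = 0.3, -0.2                        # arbitrary real arguments
c = 1 - 2j * t2

# phi^{-2} from the product over the nu's ...
phi_inv_sq_product = np.prod(c - 2j * t1 * nu)
# ... and from (1 - 2 i t2)^{-k} |I_n (1 - 2 i t2) - 2 i t1 MA|.
phi_inv_sq_det = c ** (-k) * np.linalg.det(c * np.eye(n) - 2j * t1 * (M @ A))
```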










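A more manageable expression can be obtained by considering first the case of a single
independent variable. The characteristic function $\phi_1(t_1, t_2)$ is then given by
\[ \phi_1^{-2} = \prod_{j=1}^{n-1} (1 - 2\theta_j i t_1 - 2 i t_2), \tag{13} \]
where $\theta_1, \theta_2, \dots, \theta_{n-1}$ are the roots of the equation
\[ \sum_{i=1}^{n} l_i^2 \prod_{j \neq i}^{n} (\theta - \lambda_j) = 0. \tag{2 bis} \]
Consequently
\[ \prod_{j=1}^{n-1} (\theta - \theta_j) = \sum_{i=1}^{n} l_i^2 \prod_{j \neq i}^{n} (\theta - \lambda_j)
= \prod_{j=1}^{n} (\theta - \lambda_j) \sum_{i=1}^{n} \frac{l_i^2}{\theta - \lambda_i}, \]
for all values of $\theta$ except $\lambda_1, \lambda_2, \dots, \lambda_n$. From (13)
\[ \begin{aligned}
\phi_1^{-2} &= (2it_1)^{n-1} \prod_{j=1}^{n-1} \left( \frac{1 - 2it_2}{2it_1} - \theta_j \right) \\
&= (2it_1)^{n-1} \prod_{j=1}^{n} \left( \frac{1 - 2it_2}{2it_1} - \lambda_j \right) \sum_{i=1}^{n} \frac{l_i^2}{\dfrac{1 - 2it_2}{2it_1} - \lambda_i} \\
&= \prod_{j=1}^{n} (1 - 2\lambda_j i t_1 - 2 i t_2) \sum_{i=1}^{n} \frac{l_i^2}{1 - 2\lambda_i i t_1 - 2 i t_2}.
\end{aligned} \tag{14} \]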


The left-hand factor of this expression is the characteristic function of u and v that would 
be obtained if the z’s were independent normal variables, the right-hand factor giving the 
modification due to the fitting of a regression on a single independent variable. 





To reduce the expression further we note that $\dfrac{1}{1 - 2\lambda_j i t_1 - 2 i t_2}$ $(j = 1, 2, \dots, n)$ are the latent
roots of the matrix $\{(1 - 2it_2) I_n - 2it_1 A\}^{-1} = B^{-1}$ say.

Moreover, the latent vectors of $B^{-1}$ are the same as those of $A$ so that $l_1, l_2, \dots, l_n$ are the
direction cosines of the vector $x$ relative to these latent vectors. Consequently
\[ \sum_{j=1}^{n} \frac{l_j^2}{1 - 2\lambda_j i t_1 - 2 i t_2} = \frac{x'B^{-1}x}{x'x}, \]
where $x$ is the independent variable concerned. Also
\[ \prod_{j=1}^{n} (1 - 2\lambda_j i t_1 - 2 i t_2) = |B|. \]
Thus
\[ \phi_1^{-2} = |B|\, \frac{x'B^{-1}x}{x'x}. \tag{15} \]
It is interesting to note that the second factor has the general form $r$.
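The single-variable result can be checked numerically by evaluating $\phi_1^{-2}$ once from the roots $\theta_j$ and once from $|B|\,x'B^{-1}x / x'x$ with $B = (1 - 2it_2) I_n - 2it_1 A$. The sketch uses random illustrative inputs; note that $x'$ denotes the ordinary transpose, with no complex conjugation.

```python
import numpy as np

rng = np.random.default_rng(7)            # illustrative data only
n = 6
x = rng.normal(size=n)                    # the single independent variable
S = rng.normal(size=(n, n))
A = (S + S.T) / 2

# theta_1..theta_{n-1}: latent roots of M_1 A other than the zero root.
Q, _ = np.linalg.qr(x.reshape(-1, 1), mode="complete")
L1 = Q[:, 1:]
theta = np.linalg.eigvalsh(L1.T @ A @ L1)

t1, t2 = 0.4, 0.1                         # arbitrary real arguments
c = 1 - 2j * t2
B = c * np.eye(n) - 2j * t1 * A

phi1_inv_sq_roots = np.prod(c - 2j * t1 * theta)
# x is real, so x @ (...) is the bilinear form x' B^{-1} x.
phi1_inv_sq_B = np.linalg.det(B) * (x @ np.linalg.solve(B, x)) / (x @ x)
```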


By a direct extension of this argument it can be shown that for regression on $k$ independent
variables the characteristic function is given by


x, By! x, 
B , 
= |B,| oe <x. 


the $M_i$'s being defined as in §2. This result could also be written down directly given (15) in
virtue of the reproductive property of the products $\dots M_2 M_1 A$ mentioned in §2.

Putting $t_2 = 0$ we obtain the characteristic function of $u$. The cumulants and hence the
moments can then be obtained by the expansion of $\log \phi$.







4. CHOICE OF TEST CRITERION* 


To decide upon a suitable test criterion an important consideration is the set of alternative 
hypotheses against which it is desired to discriminate. The kind of alternative we have in 
mind in this paper is such that the correlogram of the errors diminishes approximately 
exponentially with increasing separation of the observations. A convenient model for such 
hypotheses is the stationary Markoff process 


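\[ \varepsilon_i = \rho\,\varepsilon_{i-1} + u_i \quad (i = \ldots, -1, 0, 1, \ldots), \tag{16} \]
where $|\rho| < 1$ and $u_i$ is normal with mean zero and variance $\sigma^2$ and is independent of
$\varepsilon_{i-1}, \varepsilon_{i-2}, \ldots$ and $u_{i-1}, u_{i-2}, \ldots$. The null hypothesis is then the hypothesis that $\rho = 0$ in (16).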

It has been shown by T. W. Anderson (1948) that no test of this hypothesis exists which is 
uniformly most powerful against alternatives (16). Anderson also showed, however, that for 
certain regression systems with error distributions close to that given by (16) tests can be 
obtained which are uniformly most powerful against one-sided alternatives (16) and which 
give type $B_1$ regions for two-sided alternatives (16).

These regression systems include cases in which the regression vectors are constant vectors 
coinciding with latent vectors of a matrix $\Theta$ (or with linear combinations of $k$ of them) and in
which the error distributions have density functions of the form

\[ K \exp\left[ -\frac{1}{2\sigma^2} \left\{ (1 + \rho^2)\,\varepsilon'\varepsilon - 2\rho\,\varepsilon'\Theta\varepsilon \right\} \right]. \tag{17} \]

For such cases the uniformly most powerful test of the hypothesis $\rho = 0$ against alternatives
$\rho > 0$ is given by $r > r_0$, where $r = z'\Theta z / z'z$, $z$ being the vector of residuals from least squares
regression, and $r_0$ being determined to give a critical region of appropriate size. For two-sided
alternatives to $\rho = 0$ the type $B_1$ test is given by $r < r_1$, $r > r_2$, where $r_1$ and $r_2$ are determined
so as to give a critical region of appropriate size and to satisfy the relation

\[ \int_{r_1}^{r_2} r\,p(r)\,dr = E(r) \int_{r_1}^{r_2} p(r)\,dr, \]

$p(r)$ being the density function of $r$ in the null case.
We recall that whatever the regression vectors,

\[ r_L \le r \le r_U, \tag{18} \]

where $r_L$ and $r_U$ are defined in the Corollary, §2. Now $r_L$ and $r_U$ have distributions in the null
case identical with distributions of $r$ obtained from residuals from regressions on certain
latent vectors of the matrix $A$. Thus if we put $\Theta = A$ in (17) we can say that when the lower
bound $r_L$ in (18) is attained (or the upper bound), the statistic $r = z'Az / z'z$ gives a test which is
uniformly most powerful against one-sided alternatives (16) and which is of type $B_1$ against
two-sided alternatives.
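The error distribution for the stationary Markoff process (16) has the density function

\[ K \exp\left[ -\frac{1}{2\sigma^2} \left\{ (1 + \rho^2) \sum_{i=1}^{n} \varepsilon_i^2 - \rho^2 (\varepsilon_1^2 + \varepsilon_n^2) - 2\rho \sum_{i=2}^{n} \varepsilon_i \varepsilon_{i-1} \right\} \right]. \tag{19} \]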


* This section is based on the treatment given by T. W. Anderson (1948). 














424 Testing for serial correlation in least squares regression. I 
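Taking $\varepsilon'\Theta\varepsilon = \sum_{i=2}^{n} \varepsilon_i \varepsilon_{i-1}$ in (17) gives a density function

\[ K \exp\left[ -\frac{1}{2\sigma^2} \left\{ (1 + \rho^2) \sum_{i=1}^{n} \varepsilon_i^2 - 2\rho \sum_{i=2}^{n} \varepsilon_i \varepsilon_{i-1} \right\} \right], \tag{20} \]

while taking $\varepsilon'\Theta\varepsilon = \sum_{i=1}^{n} \varepsilon_i^2 - \tfrac{1}{2} \sum_{i=2}^{n} (\varepsilon_i - \varepsilon_{i-1})^2$ in (17) gives a density function

\[ K \exp\left[ -\frac{1}{2\sigma^2} \left\{ (1 + \rho^2) \sum_{i=1}^{n} \varepsilon_i^2 - \rho (\varepsilon_1^2 + \varepsilon_n^2) - 2\rho \sum_{i=2}^{n} \varepsilon_i \varepsilon_{i-1} \right\} \right]. \tag{21} \]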
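The step from the second choice of the quadratic form to the exponent of (21) rests on a simple identity, which can be checked numerically. The sketch below is illustrative only: the function names and the sample values are assumptions, not part of the paper. It compares $\sum \varepsilon_i^2 - \tfrac{1}{2}\sum(\varepsilon_i - \varepsilon_{i-1})^2$ with $\tfrac{1}{2}(\varepsilon_1^2 + \varepsilon_n^2) + \sum \varepsilon_i\varepsilon_{i-1}$.

```python
def theta_form_differences(e):
    """Quadratic form written as sum(e_i^2) - 0.5 * sum((e_i - e_{i-1})^2)."""
    total = sum(x * x for x in e)
    diffs = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    return total - 0.5 * diffs


def theta_form_products(e):
    """The same form written as 0.5 * (e_1^2 + e_n^2) + sum(e_i * e_{i-1})."""
    ends = 0.5 * (e[0] ** 2 + e[-1] ** 2)
    lagged = sum(e[i] * e[i - 1] for i in range(1, len(e)))
    return ends + lagged


# Arbitrary illustrative error values (an assumption, not data from the paper).
errors = [1.0, -2.0, 0.5, 3.0, -1.5]
print(theta_form_differences(errors), theta_form_products(errors))
```

Multiplying the common value by $-2\rho$ gives the last two terms inside the braces of (21).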


These are both close to (19). Thus following Anderson we conjecture that either value of
$\Theta$ would give a good statistic $r$ for testing against alternatives (16). Between the two statistics
there is not much to choose. We ourselves have adopted a slight modification of the second,
partly for reasons of computational convenience and partly because of similarity to von
Neumann's statistic $\delta^2/s^2$ (1941) already well known to research workers.
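The statistic we have adopted is defined by

\[ d = \frac{\sum_{i=2}^{n} (z_i - z_{i-1})^2}{\sum_{i=1}^{n} z_i^2}, \]

which is related to $\delta^2/s^2$ by $\dfrac{\delta^2}{s^2} = \dfrac{nd}{n-1}$. This is a special case of the general statistic $r = \dfrac{z'Az}{z'z}$
discussed in §2 and §3, in which

\[ A = A_d = \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}. \]
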

In the notation of the previous paragraph we would take $\Theta = I - \tfrac{1}{2}A_d$ to give the density (21).
Now the latent vectors of the matrices $A_d$ and $\Theta$ in this equation are the same. Thus when the
regression vectors are latent vectors of $A_d$ the statistic $d$ provides a uniformly most powerful
test against one-sided alternatives (21). In particular the test given by $d$ when the bounds
$r_L$ and $r_U$ are attained is uniformly most powerful.
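As a concrete illustration of the definition of $d$, the following sketch computes the statistic directly from a set of residuals and again as the ratio $z'A_dz/z'z$. It assumes the usual first-difference form of $A_d$ (diagonal 1, 2, ..., 2, 1 with $-1$ on the adjacent diagonals). The function names and the residual values are hypothetical.

```python
def durbin_watson_d(z):
    """d = sum_{i=2}^n (z_i - z_{i-1})^2 / sum_{i=1}^n z_i^2 for residuals z."""
    num = sum((z[i] - z[i - 1]) ** 2 for i in range(1, len(z)))
    den = sum(v * v for v in z)
    return num / den


def first_difference_matrix(n):
    """Assumed form of A_d: diagonal 1, 2, ..., 2, 1 and -1 beside it."""
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1 if i in (0, n - 1) else 2
        if i + 1 < n:
            a[i][i + 1] = -1
            a[i + 1][i] = -1
    return a


def quadratic_form(a, z):
    """z' A z for a square matrix given as a list of rows."""
    n = len(z)
    return sum(z[i] * a[i][j] * z[j] for i in range(n) for j in range(n))


# Illustrative residuals (they sum to zero, as with a fitted mean).
residuals = [0.5, -1.0, 1.5, -0.5, -0.5]
d_direct = durbin_watson_d(residuals)
d_matrix = quadratic_form(first_difference_matrix(len(residuals)), residuals) \
    / sum(v * v for v in residuals)
print(d_direct, d_matrix)
```

Values of $d$ near 2 correspond to little first-order serial correlation in the residuals; the two computations agree because $z'A_dz$ is the sum of squared first differences.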

The main alternative to using $d$ or a related statistic as a test criterion would be to use one
of the circular statistics such as

\[ r_c = \frac{\sum_{i=1}^{n} z_i z_{i-1}}{\sum_{i=1}^{n} z_i^2} \quad \text{or} \quad d_c = \frac{\sum_{i=1}^{n} (z_i - z_{i-1})^2}{\sum_{i=1}^{n} z_i^2}, \]

where we define $z_0 = z_n$ in each case. T. W. Anderson (1948) has shown that $r_c$ and $d_c$ give
uniformly most powerful tests against one-sided alternatives in the circular population having
a density function

\[ K \exp\left[ -\frac{1}{2\sigma^2} \left\{ (1 + \rho^2) \sum_{i=1}^{n} \varepsilon_i^2 - 2\rho \sum_{i=1}^{n} \varepsilon_i \varepsilon_{i-1} \right\} \right], \tag{22} \]

where $\varepsilon_0 = \varepsilon_n$. $r_c$ was the statistic adopted by R. L. Anderson & T. W. Anderson (1950) for
testing the residuals from regression on a Fourier series.

The disadvantage of $r_c$ and $d_c$ is that (22) is not so close to (19) as (20) or (21). The advantage
is that since the latent roots of the associated values of $A$ are equal in pairs, the results of
R. L. Anderson (1942) can sometimes be used to obtain exact distributions in the null case.
The roots of $A_d$, on the other hand, are all distinct. We conclude that $d$ or a related
non-circular statistic would seem to be preferable whenever an approximation to the distribution
is sufficient, but that a circular statistic would seem to be preferable if exact results are 
required at the expense of some loss of power. We mention that the computations involved 
in using Anderson’s exact distribution become very tedious as the number of degrees of 
freedom increases. 

The next question that arises is how good these statistics are as test criteria in cases in 
which the regression vectors are not latent vectors. Such cases are of course by far the more 
frequent in practice. It is evident that we can expect the power of the test to diminish as the 
regression vectors depart from the latent vectors, since the least squares regression coefficients 
are not then maximum likelihood estimates in the non-null case. Thus any test based on least 
squares residuals cannot even be a likelihood ratio test. Against this three points can be made. 
The first is that we still have a valid test, though possibly of reduced power. Secondly, it is 
desirable on grounds of convenience to have a test based on least squares residuals even 
though it is not an optimal test. Thirdly, the statistic $r$ necessarily lies between the bounds
$r_L$ and $r_U$ and when these bounds are attained the test is optimal. We note also that it is only
for the latent vector case that the distribution problems have been approached with any 
success. 


5. SOME SPECIAL RESULTS 


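To obtain the moments of $d$ we need the powers of $A_d$. Because of the symmetry of these
matrices they are completely specified by the top left-hand triangle. Thus we can write

\[ A_d = \begin{array}{rrrr} 1 & -1 & 0 & \cdots \\ & 2 & -1 & \cdots \\ & & 2 & \cdots \end{array} \]

We find

\[ A_d^2 = \begin{array}{rrrrr} 2 & -3 & 1 & 0 & \cdots \\ & 6 & -4 & 1 & \cdots \\ & & 6 & -4 & \cdots \\ & & & 6 & \cdots \end{array} \]

\[ A_d^3 = \begin{array}{rrrrrrr} 5 & -9 & 5 & -1 & 0 & 0 & \cdots \\ & 19 & -15 & 6 & -1 & 0 & \cdots \\ & & 20 & -15 & 6 & -1 & \cdots \\ & & & 20 & -15 & 6 & \cdots \end{array} \]

\[ A_d^4 = \begin{array}{rrrrrrrr} 14 & -28 & 20 & -7 & 1 & 0 & 0 & \cdots \\ & 62 & -55 & 28 & -8 & 1 & 0 & \cdots \\ & & 70 & -56 & 28 & -8 & 1 & \cdots \\ & & & 70 & -56 & 28 & -8 & \cdots \end{array} \]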












Rather than use these matrices as they stand, however, it will probably be more convenient
to proceed by finding the sums of squares of the successive differences of the $z$'s. Denoting
the $s$th differences by $\Delta^s z$ we have

\[
\begin{aligned}
z'A_d z &= \sum_{i=1}^{n-1} (\Delta z_i)^2, \\
z'A_d^2 z &= \sum (\Delta^2 z_i)^2 + (z_1 - z_2)^2 + (z_{n-1} - z_n)^2, \\
z'A_d^3 z &= \sum (\Delta^3 z_i)^2 + 4z_1^2 + 9z_2^2 + z_3^2 - 12z_1 z_2 - 6z_2 z_3 + 4z_1 z_3 \\
&\quad + \text{a similar expression in } z_n, z_{n-1}, z_{n-2}, \\
z'A_d^4 z &= \sum (\Delta^4 z_i)^2 + 13z_1^2 + 45z_2^2 + 17z_3^2 + z_4^2 - 48z_1 z_2 - 54z_2 z_3 - 8z_3 z_4 + 28z_1 z_3 + 12z_2 z_4 \\
&\quad - 6z_1 z_4 + \text{a similar expression in } z_n, z_{n-1}, z_{n-2}, z_{n-3}.
\end{aligned} \tag{23}
\]
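The first three relations in (23) can be verified on any integer vector. The following sketch does so without forming the matrices, assuming the first-difference form of $A_d$. The vector and the helper names are arbitrary illustrations.

```python
def apply_A(z):
    """Multiply the (assumed) first-difference matrix A_d into the vector z."""
    n = len(z)
    out = []
    for i in range(n):
        diag = 1 if i in (0, n - 1) else 2
        left = z[i - 1] if i > 0 else 0
        right = z[i + 1] if i < n - 1 else 0
        out.append(diag * z[i] - left - right)
    return out


def quad_form_power(z, s):
    """z' A_d^s z, computed by applying A_d to z s times."""
    v = list(z)
    for _ in range(s):
        v = apply_A(v)
    return sum(a * b for a, b in zip(z, v))


def sum_sq_differences(z, s):
    """Sum of squares of the s-th differences of z."""
    v = list(z)
    for _ in range(s):
        v = [v[i + 1] - v[i] for i in range(len(v) - 1)]
    return sum(x * x for x in v)


def end_correction_3(a, b, c):
    """End terms of (23) for s = 3, with (a, b, c) = (z_1, z_2, z_3)."""
    return 4 * a * a + 9 * b * b + c * c - 12 * a * b - 6 * b * c + 4 * a * c


z = [3, -1, 4, 1, -5, 9, -2, 6]  # arbitrary integers
lhs1, rhs1 = quad_form_power(z, 1), sum_sq_differences(z, 1)
lhs2 = quad_form_power(z, 2)
rhs2 = sum_sq_differences(z, 2) + (z[0] - z[1]) ** 2 + (z[-2] - z[-1]) ** 2
lhs3 = quad_form_power(z, 3)
rhs3 = (sum_sq_differences(z, 3)
        + end_correction_3(z[0], z[1], z[2])
        + end_correction_3(z[-1], z[-2], z[-3]))
print(lhs1 == rhs1, lhs2 == rhs2, lhs3 == rhs3)
```

Working with differences in this way avoids storing the powers of $A_d$ at all, which is the computational convenience the text has in mind.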
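For the circular definition of $d$, i.e.

\[ d_c = \frac{\sum_{i=1}^{n} (z_i - z_{i-1})^2}{\sum_{i=1}^{n} z_i^2} \quad \text{with } z_0 = z_n, \]

the correction terms disappear, giving

\[ z'A_c^s z = \sum_{i=1}^{n} (\Delta^s z_i)^2, \quad \text{where } z_{-j} = z_{n-j}. \]

The latent roots of $A_d$ are given by

\[ \lambda_j = 2\left\{ 1 - \cos\frac{\pi(j-1)}{n} \right\} \quad (j = 1, 2, \ldots, n) \tag{24} \]

(von Neumann 1941). The first four power sums are:

\[
\begin{aligned}
\sum_{j=1}^{n} \lambda_j &= 2(n - 1), \\
\sum \lambda_j^2 &= 2(3n - 4), \\
\sum \lambda_j^3 &= 4(5n - 8), \\
\sum \lambda_j^4 &= 2(35n - 64).
\end{aligned} \tag{25}
\]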
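The power sums (25) follow from the roots (24) and can be confirmed numerically. The sketch below evaluates both sides for one value of $n$; the value $n = 25$ and the function names are arbitrary choices for illustration.

```python
import math


def latent_roots(n):
    """Roots (24): lambda_j = 2 * (1 - cos(pi * (j - 1) / n)), j = 1..n."""
    return [2.0 * (1.0 - math.cos(math.pi * (j - 1) / n)) for j in range(1, n + 1)]


def power_sum(roots, s):
    """Sum of the s-th powers of the roots."""
    return sum(r ** s for r in roots)


n = 25
roots = latent_roots(n)
closed_forms = {
    1: 2 * (n - 1),
    2: 2 * (3 * n - 4),
    3: 4 * (5 * n - 8),
    4: 2 * (35 * n - 64),
}
for s, expected in closed_forms.items():
    print(s, power_sum(roots, s), expected)
```

These sums are the traces of $A_d$, $A_d^2$, $A_d^3$ and $A_d^4$, so the same figures can be read off the diagonals of the powers of the matrix.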

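The latent vector corresponding to the zero root $\lambda_1$ is $\{1, 1, \ldots, 1\}$, which is the regression
vector corresponding to a constant term in the regression model. For regressions with
a fitted mean, therefore, we need only consider the remaining $n - 1$ $\lambda$'s which we renumber
accordingly so that

\[ \lambda_j = 2\left( 1 - \cos\frac{\pi j}{n} \right) \quad (j = 1, 2, \ldots, n - 1). \]

With these $\lambda$'s we have from the Corollary, §2,

\[ d_L \le d \le d_U, \tag{26} \]

where

\[ d_L = \frac{\sum_{i=1}^{n-k'-1} \lambda_i \zeta_i^2}{\sum_{i=1}^{n-k'-1} \zeta_i^2}, \tag{27} \]

\[ d_U = \frac{\sum_{i=1}^{n-k'-1} \lambda_{i+k'} \zeta_i^2}{\sum_{i=1}^{n-k'-1} \zeta_i^2}, \tag{28} \]

$k'$ being the number of independent variables in the model in addition to the constant term.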


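With the error distribution assumed in §3 the limits of the mean of $d$ are given by

\[ E(d) \le E(d_U) = 2 - \frac{2}{n - k' - 1} \sum_{j=k'+1}^{n-1} \cos\frac{\pi j}{n}, \]

\[ E(d) \ge E(d_L) = 2 - \frac{2}{n - k' - 1} \sum_{j=1}^{n-k'-1} \cos\frac{\pi j}{n}. \]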


We state without proof the limits of the variance of d: 











< ee = ee 
OTT hee es Be ee 
as at oe 8(n—k’ — 2) (n—k') m 
‘al . cea eA... 2 ) 1 
SG —l (mk +1) », cos * + ik —1(n—k 41) on 
(n—k’ even), 
16 (n—1) nj 





os? —t 
Qu =Nen¥ ee = (n odd), 


16 (n—2) 15 
> + ea 
GF -De_P the. . (nm even). 





To give some idea of how the distribution of d can vary for different regression vectors we 
give a short table of the limiting means and variances. 



































                     k' = 1              k' = 3              k' = 5
                  Mean   Variance     Mean   Variance     Mean   Variance
n = 20   Lower    1.89    0.157       1.65    0.101       1.38    0.048
         Upper    2.11    0.200       2.35    0.249       2.62    0.313
n = 40   Lower    1.95    0.090       1.84    0.077       1.72    0.063
         Upper    2.05    0.100       2.16    0.111       2.28    0.124
n = 60   Lower    1.97    0.062       1.89    0.057       1.82    0.051
         Upper    2.03    0.067       2.11    0.071       2.18    0.077
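The limiting means in the table can be recomputed from the latent roots. The sketch below assumes that $E(d_L)$ and $E(d_U)$ are the averages of the $n - k' - 1$ smallest and largest of the renumbered roots $\lambda_j = 2(1 - \cos \pi j/n)$, which is what the bounds on the mean amount to when the ratio is independent of its denominator. The function name is illustrative.

```python
import math


def mean_limits(n, k_prime):
    """Return (E(d_L), E(d_U)) as averages of the smallest/largest roots."""
    m = n - k_prime - 1
    roots = [2.0 * (1.0 - math.cos(math.pi * j / n)) for j in range(1, n)]
    lower = sum(roots[:m]) / m          # lambda_1 .. lambda_{n-k'-1}
    upper = sum(roots[k_prime:]) / m    # lambda_{k'+1} .. lambda_{n-1}
    return lower, upper


for n in (20, 40, 60):
    for k_prime in (1, 3, 5):
        lower, upper = mean_limits(n, k_prime)
        print(n, k_prime, round(lower, 2), round(upper, 2))
```

The two limits are symmetric about 2 because the roots are symmetric about 2; the limiting variances are not reproduced here since their formulas are stated without proof in the text.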

















We wish to record our indebtedness to Prof. R. L. Anderson for suggesting this problem to 
one of us. 


REFERENCES 


Aitken, A. C. (1935). Proc. Roy. Soc. Edinb. 55, 42.

Anderson, R. L. (1942). Ann. Math. Statist. 13, 1.

Anderson, R. L. & Anderson, T. W. (1950). Ann. Math. Statist. 21, 59.

Anderson, T. W. (1948). Skand. AktuarTidskr. 31, 88.

Bartlett, M. S. (1934). Proc. Camb. Phil. Soc. 30, 327.

Cochrane, D. & Orcutt, G. H. (1949). J. Amer. Statist. Soc. 44, 32.

Courant, R. & Hilbert, D. (1931). Methoden der Mathematischen Physik. Julius Springer.

Dixon, W. J. (1944). Ann. Math. Statist. 15, 119.

Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Oliver and Boyd.

Hart, B. I. (1942). Ann. Math. Statist. 13, 207.

Moran, P. A. P. (1950). Biometrika 37, 178.

von Neumann, J. (1941). Ann. Math. Statist. 12, 367.

Pitman, E. J. G. (1937). Proc. Camb. Phil. Soc. 33, 212.

Rubin, H. (1945). Ann. Math. Statist. 16, 211.

Wold, H. (1949). 'On least squares regression with auto-correlated variables and residuals.' (Paper
read at the 1949 Conference of the International Statistical Institute.)










DISTRIBUTION OF 'STUDENT'-FISHER'S t IN SAMPLES FROM
COMPOUND NORMAL FUNCTIONS 


By HANNES HYRENIUS 
Lund, Sweden 


I. INTRODUCTION 


1. Investigations of sampling distributions and methods of statistical analysis, based on 
more general assumptions than that of a normal parent population, are carried on along 
several different lines. Apart from a number of experimental sampling investigations (see 
p. 441), three main principles may be distinguished as follows: 

A. Samples drawn from specified non-normal variation schemes. 

B. Transformation of variables. 

C. Non-parametric methods. 


The present paper deals with problems under category A for which the following specifica- 
tions may be given: 

(a) Each individual of the sample drawn from its specific normal parent population 
(‘individual normal universe’). 

(b) All individuals of the sample drawn from the same non-normal parent population
(‘common non-normal universe’). 

(c) Each individual of the sample drawn from a specific non-normal population (‘in- 
dividual non-normal universe’). 


2. The approach to the problem of non-normality according to principle (a) has been used
in a number of studies by C.-E. Quensel, H. Robbins and M. Weibull. Quensel (1944) showed
that if the individuals of the sample are drawn from different normal universes with common
mean but with different variances, 'Student'-Fisher's $t$ will be distributed with less variance
than in the normal theory. In a subsequent article (1947) the same author made Fisher's
variance-ratio $z$ the subject of a similar treatise.

H. Robbins (1948) deduced the distribution of $t$ under the assumption that the individual
normal universes have different centres but the same variance. In a recent study by
M. Weibull (1950), the same model is treated more rigorously and partly along other lines.
The distributions of $t$ and $z$ are derived. In the case of $t$, it is proved that the variance and the
fourth cumulant are less than in the normal theory distribution.

Principle (b), common non-normal universe, has been used in a number of studies during 
the last two decades. Only for a few distributions has it been possible to derive sampling 
distributions for any size of sample. Many of the studies are restricted to very small sizes 
and often deal with rather specific parent distributions such as the rectangular and the 
triangular distributions. Reference may be made here to works by G. A. Baker (1932), 
M. S. Bartlett (1935), J. Laderman (1939), V. Perlo (1933) and P. R. Rider (1929, 1931). 

Among more general attempts is the use of Gram-Charlier's A-series. R. C. Geary (1936)
deduced the $t$-distribution for samples from an A-series of two terms ($\varphi(x) + \beta_3 \varphi^{(3)}(x)$). The
problem was also treated by C.-E. Quensel (1938) and was further developed in two recent
articles by R. C. Geary (1947) and A. K. Gayen (1949), taking into consideration more terms
of the A-series.

Another type of non-normal universe, for which it has been possible to derive the
$t$-distribution, is that of a compound normal distribution, consisting of the weighted sum of a
number of different normal functions [$\sum p_i \varphi_i(x)$]. Using this model, G. A. Baker (1932)
presented the distribution of $t$ for samples of two items drawn from a composition of two
normal functions with different centres. The distribution for any size of sample was derived
by the present author in 1949. The following paper gives a generalization to an arbitrary
number of normal components with varying means and/or variances and any size of sample.

The idea of using individual universes in drawing the individuals of a sample is of a fairly
recent date. For that reason no results have so far been obtained by combining the two first
methods into principle (c).

3. It was pointed out by Quensel that the model, here called individual normal universe, 
is present in case of time variation within an individual; this can be considered to be fairly 
frequent in medical and biological analyses. Certain experimental problems in agriculture 
will also possess the same structure of variation. 

The compound normal functions, on the other hand, offer good models of industrial pro- 
duction where products are subject to variations within and between machines and/or 
within and between workers. This scheme is also present in certain problems of educational 
statistics, where the persons tested may be thought of as drawn from disparate strata of the 
population according to heredity and education.


II. SAMPLING FROM $f(x) = \sum p_i \varphi[x; \mu_i, \lambda]$

1. The parent population is given by the compound normal function

\[ f(x) = \sum_i p_i \varphi[x; \mu_i, \lambda]. \tag{1} \]

Introducing

\[ M'_k = \sum_i p_i \mu_i^k, \tag{2} \]

and calculating $M_k$ from $M'_k$ by the relations between moments about the mean and about
zero, the distribution can be most easily characterized by its cumulants

\[ \kappa_1 = M'_1, \quad \kappa_2 = \lambda + M_2, \quad \kappa_3 = M_3, \quad \kappa_4 = M_4 - 3M_2^2, \quad \text{etc.} \]
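The cumulant relations above are easy to evaluate for a specific parent population. The sketch below applies them to the three-component example used later in the paper (weights 0.60, 0.25, 0.15; centres 95, 100, 120; common variance 25). The function names are illustrative, and the shape measures are formed as $\gamma_1 = \kappa_3/\kappa_2^{3/2}$ and $\gamma_2 = \kappa_4/\kappa_2^2$.

```python
def raw_moment(probs, centres, k):
    """M'_k = sum_i p_i * mu_i**k, the k-th moment of the centres about zero."""
    return sum(p * mu ** k for p, mu in zip(probs, centres))


def central_moment(probs, centres, k):
    """M_k, the k-th moment of the centres about their mean M'_1."""
    mean = raw_moment(probs, centres, 1)
    return sum(p * (mu - mean) ** k for p, mu in zip(probs, centres))


def cumulants(probs, centres, lam):
    """First four cumulants of sum_i p_i * phi[x; mu_i, lam]."""
    m2 = central_moment(probs, centres, 2)
    k1 = raw_moment(probs, centres, 1)
    k2 = lam + m2
    k3 = central_moment(probs, centres, 3)
    k4 = central_moment(probs, centres, 4) - 3.0 * m2 ** 2
    return k1, k2, k3, k4


probs, centres, lam = (0.60, 0.25, 0.15), (95.0, 100.0, 120.0), 25.0
k1, k2, k3, k4 = cumulants(probs, centres, lam)
gamma1 = k3 / k2 ** 1.5
gamma2 = k4 / k2 ** 2
print(k1, k2 ** 0.5, gamma1, gamma2)
```

Because the normal components contribute only to the second cumulant, all higher cumulants of the compound population are those of the distribution of the centres alone.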
2. The characteristic function (c.f.) of $\bar{x} = \sum x/N$ and $v_2' = \sum x^2/N$ is given by


r mad 
OUtet,)= Ulhat = b —igey Mt [1-74] 
bs hole 


sone 2a] 


Here Kk? denotes the rth of all possible combinations of n out of N values. In other words, 
r stands for one separate combination of the components ¢. 


r) yy, » 42 
By writing a, = a b, = 7 eS, 


(3) 


HPA + 2Ny;t, + we ft 








(5) 


and C, = If ay (6) 


mer 





the c.f. can be reduced to the simple form 





-4N 2 y 
oaks afi 2A) TH oy [A+ 2Naety + 2B, 4) - 
NW on| 1-7; 
-F hh 
Using the notation y,= ‘tb, —a?], (8) 


the frequency function (fr.f.) of $\bar{x}$ and $v_2 = v_2' - \bar{x}^2$ can be written as


e—Nvgi2a yi N—1)+m-1 
a2 Wes m2 HN—1)+m * 


This is formally the same as was already derived for the case of two components (Hyrenius, 
1949). 


Alternative forms are 





F[%, v,] = x C9 Z; a,, nerd (9) 











FR») = ZC] #0, 2] & (—1mlben arr ig woe F], (92) 
(r) a Nim N 
Z,¥,) = ee XI N-1 2a Ny,ve N-1 
F(%, Ve) = 2C,9) oH | flrs 2 .NW 1 = i (9b) 








PE) = 69 Fi a5 || 


co JAI ds 


where f[z; 2, y] stands for Pearson’s distribution of type III (the ['-distribution), while 
of, [2; «] is a confluent hypergeometrical series and J,,(z) the Bessel function of the first kind. 

For each combination of components $(r)$, $\bar{x}$ and $v_2$ are independent. Dependence is
introduced by weighting the various combinations together into a total or complex.








3. The variance estimate $s^2 = \sum (x - \bar{x})^2/(N - 1)$ is distributed as
ym »exp| - (N Oe | oreo 
wr or "t, ai “(NaI SA \KN-Dem aa 
Ay so = (= = i) 
The corresponding c.f. is 
Uli] = EG 1-4 4. Pr ll 
[ta] = re { ya" exp] "a ‘I: (11) 
I~ ya" 


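4. The moments of the correlation surface of $\bar{x}$ and $s^2$ are

\[ m'_{ij} = \sum_{(r)} C_r\, m'^{(r)}_{i0}\, m'^{(r)}_{0j}. \tag{12} \]

Here the first moments $m'^{(r)}_{i0}$ of the mean $\bar{x}$ are

\[
\begin{aligned}
m'^{(r)}_{10} &= \alpha_r, & m'^{(r)}_{20} &= \frac{\lambda}{N} + \alpha_r^2, \\
m'^{(r)}_{30} &= 3\frac{\lambda}{N}\alpha_r + \alpha_r^3, & m'^{(r)}_{40} &= 3\frac{\lambda^2}{N^2} + 6\frac{\lambda}{N}\alpha_r^2 + \alpha_r^4, \quad \text{etc.}
\end{aligned} \tag{13}
\]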






while the moments $m'^{(r)}_{0j}$ of $s^2$ are given by the expression





rs) 

j 2a Ti \ 2 | iar? ee 

n= Wi (7) Fl - 7, -i.7>- |. (14) 
2 


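It is first found that

\[
\begin{aligned}
m'_{10} &= M'_1, & m'_{01} &= \lambda + M_2, \\
m_{20} &= \frac{\lambda + M_2}{N}, & m_{11} &= \frac{M_3}{N}, \\
m_{02} &= \frac{M_4 - 3M_2^2}{N} + \frac{2(\lambda + M_2)^2}{N - 1}.
\end{aligned} \tag{15}
\]

The higher characteristics are most easily expressed by the cumulants of the parent
population $\kappa_r$, calculated from $M_k$ in the ordinary way. We thus find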
wy WN Ku = ye Xe 


N-1\2 (N—1)? N-1, 
Koa = ya Kat 2 a 


Voo 





Ko = ya yk = ye Ke 
eh (N —1)2 (N —1) 
a i a 














W = ya Ks +4 a K3Ke, 
N-1)\3 (N —1)8 (N —1)? (N —1)(N-2) (N —1) 
( V J og = ee eg t 12 we Kaka t+ 4a K3+8 Wa kK, 
eee r (16) 
Ko= ys yu = ya Xe 


‘N—-1\? N-1) N-1 


Gy) o~ge 4 N—v! 





N-1) 
ya Kas 


























W ye ks 12 ne Kha 24 aga 4 ON A, 
N-1\4 N-1)4 N-1)8 
( V ) a“ N? r+ 2a) Kg Ke 
(N —1)?(N —2) (N — 1) (4N?—9N + 6) 
+32 K5K3+ 8 xt 
Né 5°38 Né 4 
=~ 5) sie si _ 
+a) xexg+96 4 — (N32) 1) 3) N-1 








NG Kk, +48 Wi KE. | 
It is to be noted that both the mean and the variance of $s^2$ are greater than in the normal case.





The regression of $s^2$ on $\bar{x}$ is found to be

2A XxC,Y-d z; Ay» x 

M,(s*) = A+ i =. (17) 

V-1 ey | 
>> Cd v; hp, x| ( 
rs) N. 


It is seen that the curve lies above the line $s^2 = \lambda$.










5. From (9) and (10) the fr.f. of ¢ = = WN can be derived as 


s elas] 


F(t)dt = %C ~ 
«) re) rexp| -7 4 2A x2 |= =0 n= =om! 


pV +2m+n fe og 
( 2 WA dt 


14) r(~= - | E ‘“ wo ee: Ww’ 


6. The moments of $t$ can be expressed by the following formulae:
(= —1+2m—- *) 


, , N — 1] ve 2 
= 5 pee Ve — 
vel?) XC, N E ] F ml See 


2 
r (* -1 5) 
eT Wee k N-1 
>) Fl -7: 9” =|. (19) 
iy 
2 
Moments of odd order are not reducible to simple explicit forms. For moments of even order 


the infinite series can be transferred to the differential equation 


Pe tas al (20) 








x (18) 














N-IT}t 
= 1. Nt 
3 C,mi,.we | 


where a = 3, 5, ..., ete. 
7. As for the variance of t, the following may be observed. Writing 
r (* —1+2m— *) 
~» @ E. 2 


‘ = Yr — 21 


2 








we have for the combination of components (7): 


soe 


vl) = [5+ ye Sy : (22) 


N-1, N(N-1) 


Vo,(t) = 5) So, + 2A 








dais, vi Si,]. 





For the whole complex we have 


x = Baa, [| FO os * | 


N(N-1) 
vi(t) = ol 5 + at | )s.. 
2(t) ~ N 2A 2 


-1 1 N(N- . 
v(t) es BQ7= 8 +3G~ 4 oF[Sa, — Si]+ 2G ——— et eS ») o rSy- %6,0,8,| ? (24) 
(r) (r) (r) 




(23) 








j 
from which follows 















The variance is here decomposed into three parts which can be interpreted in the following
way.

The formula $t = (\bar{x} - M)\sqrt{N}/s$ for the whole complex means for the different combinations
of components a non-central $t$, the variance of which is given by the sum of the first two terms
on the right-hand side of (24). The first one gives a subnormal variance [$< (N - 1)/(N - 3)$],
corresponding to the scheme studied by Robbins and Weibull ('individual normal parent
population with varying means'); this part may be called 'the within-combination variance'.
The second term may be referred to as 'the non-centrality effect', which indicates its
meaning. The third term in (24) is the square of the difference between the mean of the $t$ of
the combination $r$ and that of the whole complex; it may be called 'the centre-dispersion
effect'.

The size of the three parts is determined by $N$, $\alpha$ and $\gamma$, where $\alpha \gtrless 0$, $\gamma \ge 0$.

The case $\alpha = 0$, $\gamma = 0$ is to some extent trivial, giving rise to normal variance in $t$. There
will, however, usually be a certain centre-dispersion effect.

For $\alpha = 0$, $\gamma > 0$ the within-combination variance will decrease with increasing $\gamma$. There
will be no non-centrality effect but a centre-dispersion effect. For $\alpha \ne 0$ the non-centrality
effect will increase with $|\alpha|$ but diminish with $\gamma$. For $\gamma = 0$ the within-combination
variance always equals the normal variance of $t$, but the two other parts can show considerable
fluctuations, due to the magnitude of $|\alpha|$.

8. For numerical applications the moments of $t$ can be directly calculated by means of
tabulations of confluent hypergeometrical series.


As an example, the results are given below from a compound normal parent population,
which is further studied through experimental sampling in a subsequent section (see p. 437).
The function is given by

\[ f(x) = 0.60\varphi[x; 95, 25] + 0.25\varphi[x; 100, 25] + 0.15\varphi[x; 120, 25], \]

having the shape characteristics $\gamma_1 = \sqrt{\beta_1} = 1.125$, $\gamma_2 = \beta_2 - 3 = 0.750$.
Choosing the sample size $N = 5$, we obtain

\[ v_2(t) = 0.843 + 0.460 + 1.663 = 2.966; \quad \sigma(t) = 1.722. \]

The three terms in $v_2(t)$ correspond to the within-combination variance, the non-centrality
effect, and the centre-dispersion effect respectively.

The values given may be compared with those of the normal case, $v_2 = 2$, $\sigma = 1.414$.


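III. SAMPLING FROM $f(x) = \sum p_i \varphi[x; \mu_i, \lambda_i]$

1. The general form of a compound normal distribution is

\[ f(x) = \sum_{i=1}^{n} p_i \varphi[x; \mu_i, \lambda_i]. \tag{25} \]

By writing

\[ B_{kl} = \sum_{i=1}^{n} p_i \mu_i^k \lambda_i^l, \tag{26} \]

its first moments are given as

\[
\begin{aligned}
m_1 &= B_{10}, \qquad m_2 = (B_{20} - B_{10}^2) + B_{01}, \\
m_3 &= (B_{30} - 3B_{20}B_{10} + 2B_{10}^3) + 3(B_{11} - B_{10}B_{01}), \\
m_4 &= (B_{40} - 4B_{30}B_{10} + 6B_{20}B_{10}^2 - 3B_{10}^4) + 6(B_{21} - 2B_{11}B_{10} + B_{10}^2 B_{01}) + 3B_{02}.
\end{aligned} \tag{27}
\]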
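The moment formulae (27) can be evaluated mechanically once the $B_{kl}$ are available. The sketch below applies them to the two parent populations used in the experimental section: the three-component function with common variance 25, and the two-component function $0.8\varphi[x; 0, 16] + 0.2\varphi[x; 0, 36]$. The function names are assumptions introduced for illustration.

```python
def B(components, k, l):
    """B_kl = sum_i p_i * mu_i**k * lam_i**l for (p_i, mu_i, lam_i) triples."""
    return sum(p * mu ** k * lam ** l for p, mu, lam in components)


def moments(components):
    """Mean and central moments m_2, m_3, m_4 according to (27)."""
    b10, b20, b30, b40 = (B(components, k, 0) for k in (1, 2, 3, 4))
    b01, b02 = B(components, 0, 1), B(components, 0, 2)
    b11, b21 = B(components, 1, 1), B(components, 2, 1)
    m1 = b10
    m2 = (b20 - b10 ** 2) + b01
    m3 = (b30 - 3 * b20 * b10 + 2 * b10 ** 3) + 3 * (b11 - b10 * b01)
    m4 = ((b40 - 4 * b30 * b10 + 6 * b20 * b10 ** 2 - 3 * b10 ** 4)
          + 6 * (b21 - 2 * b11 * b10 + b10 ** 2 * b01)
          + 3 * b02)
    return m1, m2, m3, m4


def shape(components):
    """Mean, standard deviation, gamma_1 and gamma_2 of the compound function."""
    m1, m2, m3, m4 = moments(components)
    return m1, m2 ** 0.5, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


three_machines = [(0.60, 95, 25), (0.25, 100, 25), (0.15, 120, 25)]
two_machines = [(0.8, 0, 16), (0.2, 0, 36)]
print(shape(three_machines))
print(shape(two_machines))
```

In the second case all centres coincide, so the departure from normality comes entirely from the term $3B_{02}$ in $m_4$.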










2. The c.f. of $\bar{x}$ and $v_2'$ is given by


¢ 4k) 
Ullah] = Uthtl = GT 1- Fe] exp] 3 epAsit2Nash+ 2Nuite) (99) 
i=1 i=1 2Ne 1 2A; | : 
“N 2 





We adopt the notation 























Uk ek dt 
/; =M+E;, A; — A+4;, my, = a (29) 
After expansions and reductions, we obtain 
Ulty ty] = EC[1 — 2dtg| NY 
(r) 
t 4 pm? a 2mY + mi {2 
; 2 (r) = a 
oe [ +m) T—oya,]N * N ; —2At,|N) 
_ 4ema+ Sumh+ omy o 
: N2 (1 — 2at,/N) 
be 2m} ty 4umy3 + 4m B 
rw a [al —2Aty)N * N(1—2At,/ NP N®  (1—2at,/N)~ 
eal , mez a OFF. 4Ami + 4mi} (30) 
2\N 1—2At,/N N2 (1—2At,/N)? Ne (1—2At,)N)e* 


Assuming the variation in ~; and A; to be limited, the series indicated above can be con- 
siderably abbreviated. Further reductions follow if the two sets of parameters are in- 
dependent. As a first degree of approximation we may choose to retain only terms in mY} 
and m{}: 

3. Converting the c.f. into the fr.f. of $\bar{x}$ and $v_2$, we obtain, after expansion and rounding
off to the same degree, the following approximate expression for the distribution of $\bar{x}$ and
$v_2 = v_2' - \bar{x}^2$:
7 a A i A 
F[%, vg] = Lg x; 15 | {Frev-v(r) — m3 SFP —a41( Vy) — mel en ad Sitv—pu (V2) 


, mos 


+ FEW —2)fithv2)], (31) 
exp(— aye —1 


W—1) (2A: 
r( 2 )() 


4. If, in particular, all A; = A, we have m%) = 0, and (31) reduces to the first two terms in 
the expression (9a). If, on the other hand, all ~; = 4 = 0, mj will vanish, and we obtain 


mi 4 A 
F(Z, v2) = ZG x; 0, | Suv »( )— Te ge 0,5 [Aa 


r) 
+™H8 cw — 2) gf 2; 0, x | ite v+2( (v)}. (32) 





where f,y_1 (Ys) is the type IIT function 





The expression within brackets, corresponding to the combination $r$, equals the result
obtained by Quensel (1944), to the same degree of approximation, on the basis of 'individual
normal parent populations with varying variances'.
















As for the variance of v,, we have 


»N- N-1) 
Wi Bh +3! 


which is greater than in the normal case. 





V_(V_) = 2 





(Bo — Boy), (33) 


5. The distribution of $t' = \bar{x}/\sqrt{v_2}$ can be derived from (31). We first introduce


Wh 











Np*\ 2 {2 (V/2A)}" (=) tn 
hawt) = exp( — 93 a rC) i+ra 
2 


where g is the ordinary ‘Student’ distribution, while h is a modification of a confluent hyper- 
geometrical series. The approximate form of the distribution of t’ can now be written as 


Fit’) = x, Ay giy(t') byt) + As gin (l’) Ayn’) + As ganiall’) veal’) 
. t’ 
+ AgGiniill’) Aan yolt’) + As Gano’) Mansell’) + Aggan(t’) hunt’) Jas?) 


+ Az ginir(t’) hywss)(t’) I sas nrescth (35) 


The coefficients A follow from those in (31) and, like the functions g, they vary with the 
combinations of components r because of m$ and m3. 


6. For the special case $\lambda_i = \lambda$, reference may be made here to § II for general treatment.
In the special case 1, = Owehaveg = land A, = A, = 0. We therefore obtain the distribution 


of t= = IN as 








; Nw, N+2_, N+4 . 
F(t) = saffi ~ we aan + ya MBI sal!) — “pyr mgusal0) (36) 
or, by writing k= XG re (37) 
r 
the simpler form 
Ne N+2_ N+4_ ‘ 

F(t) = [1 =a z| Gan (t) + —3— RGansalt) — GF sall)- (38) 

, . N-1 12K 
The variance of t is V,(t) = ya !-arcararen ‘ (39) 


thus being less than if all individuals were drawn from the same normal population. 


IV. EXPERIMENTAL SAMPLING 


The theory presented above is completed and illustrated by three sets of experimental 
sampling from compound normal distributions. 














1. For illustration of § II the function

\[ f(x) = 0.60\varphi[x; 95, 25] + 0.25\varphi[x; 100, 25] + 0.15\varphi[x; 120, 25] \tag{40} \]

has been used.

This distribution has the mean 100 and the standard deviation 10. It has been deliberately
constructed so as to deviate considerably from normality, being bimodal with the
characteristics $\gamma_1 = 1.125$ and $\gamma_2 = 0.750$.

The distribution here established can be taken as the model of an industrial production 
with three machines, the capacities of which are proportional to 60, 25 and 15 respectively, 
while the products, measured in one way or another, vary normally around 95, 100 and 120 
respectively, with the standard deviation of 5. The products are supposed to be mingled, 
and a sample is taken at random. 

The distribution is summarized in the following table: 











   x        f(x)          x        f(x)
 75- 79        5       110-114      254
 80- 84      103       115-119      492
 85- 89      747       120-124      535
 90- 94    2,243       125-129      233
 95- 99    2,949       130-134       40
100-104    1,823       135-139        3
105-109      573
                       Total     10,000




















When the exact form of the parent population is known the testing of a sample mean can
be made by integrating

\[ f(\bar{x}) = \sum_{(r)} C_r \varphi\left[ \bar{x}; \alpha_r, \frac{\lambda}{N} \right]. \tag{41} \]

The following critical values of $t = (\bar{x} - M)/\sigma_{\bar{x}}$ are calculated from the distribution (41):

P =    0.5 %    2.5 %    97.5 %    99.5 %
t =   -2.03    -1.66     2.18      2.96

The distribution of $\bar{x}$ deviates considerably from normality ($\gamma_1 = 0.503$, $\gamma_2 = 0.150$),
hence these great differences from the normal theory values. Using the value 1.96 in both
directions would erroneously give 0.7 % under the lower limit and 3.7 % above the upper
limit instead of the common value 2.5 %.
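The tail percentages quoted for the normal-theory limits can be recomputed from (41). The sketch below assumes that the weights $C_r$ are the multinomial probabilities of drawing each combination of components, that $\alpha_r$ is the average of the centres in the combination, and that each combination contributes a normal distribution of $\bar{x}$ with variance $\lambda/N$. The function names are hypothetical.

```python
import math
from itertools import product


def combinations_of_components(n_items, probs):
    """Yield (counts, multinomial weight) for every split of n_items."""
    for counts in product(range(n_items + 1), repeat=len(probs)):
        if sum(counts) != n_items:
            continue
        weight = float(math.factorial(n_items))
        for c, p in zip(counts, probs):
            weight = weight / math.factorial(c) * p ** c
        yield counts, weight


def normal_cdf(x, mean, sd):
    """Distribution function of a normal variable via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def cdf_of_mean(x, n_items, probs, centres, lam):
    """P(sample mean <= x) as a weighted sum over combinations, as in (41)."""
    sd = math.sqrt(lam / n_items)
    total = 0.0
    for counts, weight in combinations_of_components(n_items, probs):
        alpha = sum(c * m for c, m in zip(counts, centres)) / n_items
        total += weight * normal_cdf(x, alpha, sd)
    return total


probs, centres, lam, N = (0.60, 0.25, 0.15), (95.0, 100.0, 120.0), 25.0, 5
M, sd_mean = 100.0, 10.0 / math.sqrt(N)
below = cdf_of_mean(M - 1.96 * sd_mean, N, probs, centres, lam)
above = 1.0 - cdf_of_mean(M + 1.96 * sd_mean, N, probs, centres, lam)
print(round(100 * below, 1), round(100 * above, 1))
```

The same function can be inverted numerically to recover the tabulated critical values, since the distribution function of $\bar{x}$ is a finite sum of normal integrals.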

By using Kendall & Babington Smith's Random Sampling Numbers (1939), 1000 samples
of $N = 5$ items each have been prepared. For the sample distribution of the 1000 pairs of
values of the mean $\bar{x}$ and the variance estimate $s^2$, it may be noted that the correlation
coefficient is 0.63 (the exact value being 0.62). The regression of $s^2$ on $\bar{x}$ is indicated below
by a number of values:

x̄                          93    95    97    99   101   103   105   107
M_x̄(s²)  exact             30    34    50    88   125   145   168   184
          experimental     36    35    45    93   130   147   160   192








For $t = (\bar{x} - M)/s_{\bar{x}}$ the following experimental distribution was obtained:

                  F(t)
    t           -       +
  0.0- 0.9     243     338
  1.0- 1.9     131      96
  2.0- 2.9      92      14
  3.0- 3.9      33       2
  4.0- 4.9      22       —
  5.0- 5.9      13       1
  6.0- 6.9       6       —
  7.0- 7.9       5       —
  8.0- 8.9       2       —
  9.0- 9.9       1       —
 10.0-           1       —
  Total           1000

















This $t$-distribution is characterized by
$m = -0.59$, $v_2 = 3.22$, $\sigma = 1.79$, $\gamma_1 = -1.57$, $\gamma_2 = 4.96$.

All these values differ significantly from the normal theory values. On the other hand,
the observed $m$ and $\sigma$ are in full agreement with the values directly derived by equations
(23) and (24) ($-0.52$ and 1.72 respectively).

With 1000 sample values of $t$, based on 5 items each and drawn from a normal parent
population, between 15 and 35 values can be expected to lie outside $\pm 2.776$. In this case,
we find more than 100 observations below the lower limit but only 6 over the upper limit.
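The range "between 15 and 35" can be understood as roughly two standard deviations either side of the expected count in one tail. The sketch below rests on that reading, which is an assumption: the text does not state how the range was obtained. It treats the number of values beyond one 2.5 % limit as a binomial count.

```python
import math


def expected_tail_range(samples, tail_prob, width=2.0):
    """Mean count in one tail, plus and minus `width` binomial standard deviations."""
    mean = samples * tail_prob
    sd = math.sqrt(samples * tail_prob * (1.0 - tail_prob))
    return mean - width * sd, mean + width * sd


# 2.776 is the two-sided 5 % point of t with 4 degrees of freedom,
# so each tail has probability 0.025 under normal theory.
low, high = expected_tail_range(1000, 0.025)
print(round(low, 1), round(high, 1))
```

On this reading, more than 100 observations in the lower tail lies far outside what sampling fluctuation alone would produce under normal theory.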

2. As a second example the function

\[ f(x) = 0.8\varphi[x; 0, 16] + 0.2\varphi[x; 0, 36] \tag{42} \]

was taken.

Placing the centre at zero is, of course, conventional. The function is characterized by
$m = 0$, $\sigma = 4.472$, $\gamma_1 = 0$, $\gamma_2 = 0.480$.

This distribution might be the result of the following production scheme. Two machines, 
having capacities proportional to 8 and 2 respectively, both give products having the same 
mean value; for the larger machine the standard deviation of the products is 4, while the 
smaller machine, being older, gives the greater dispersion of 6. The products are taken to 
be mixed and the sample is chosen at random. 

The distribution has the following numerical form: 











   ±x       f(x)
  0- 1     2,672
  2- 4     2,105
  5- 7     1,054
  8-10       367
 11-13       104
 14-16        27
 17-19         6
 20-           1
 Total    10,000


































From this distribution 1000 samples of 5 items each have been drawn by means of random
sampling numbers. The joint distribution of $\bar{x}$ and $s^2$ shows no dependence here. For
$t = (\bar{x} - 0)\sqrt{N}/s$ the following result is obtained:

              f(t)
   t        -       +
  0-1      300     299
  1-2      155     132
  2-3       41      27
  3-4       16      15
  4-5        6       2
  5-6        3       1
  6-7        —       2
  7-8        1       —
  Total       1000

The following characteristics are obtained:
$m = -0.08$, $v_2 = 1.94$, $\sigma = 1.39$, $\gamma_1 = -0.17$, $\gamma_2 = 2.40$.

The deviations from normal theory values are non-significant for both mean, variance and 
asymmetry. The excess is finite, but in the normal case infinite. 

Expecting 15 to 35 observations in each direction lying outside + 2-776 when drawing 
1000 samples of 5 from a normal population, we find in the present case 30 values below the 
lower and 23 above the upper limit. 
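The range "15 to 35" can be reproduced as a two-standard-error band for a binomial count. This sketch assumes that 2.776 is the two-sided 5 % point of t with 4 degrees of freedom, so that each tail has probability 0.025, and that the band is taken as the mean plus or minus twice the binomial standard deviation; the source does not state how the limits were derived.

```python
import math

n_samples = 1000   # number of samples of 5 items
p_tail = 0.025     # assumed probability of t falling beyond one limit

mean = n_samples * p_tail
sd = math.sqrt(n_samples * p_tail * (1.0 - p_tail))
low, high = mean - 2.0 * sd, mean + 2.0 * sd
print(mean, round(sd, 2), round(low), round(high))
```

On these assumptions the observed counts of 30 and 23 both lie well inside the band, whereas the counts of more than 100 and of 6 in the first example lie outside it.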

The differences between this experimental material and the normal theory t-distribution
can hardly be considered significant, even though the excess may indicate a subnormal
occurrence of extreme values. It is to be observed that the restricting conditions under
which the formulae in § III were developed are not fulfilled in the present case. The variation
of t under the scheme ‘compound normal functions with varying variances’ still remains,
therefore, undetermined.


3. A third type of experimental sampling was carried out to illustrate the variations
within different combinations of components. For this purpose a set of 500 items from
Wold’s Random Normal Deviates (1948) was used in combinations of 100 samples of 5.

Calculating the ordinary $t = \bar{x}\sqrt{N}/s$, the following approximation to the normal theory
distribution of t was obtained:





    t            f(t)
              −        +

   0–1        27       20
   1–2        20       16
   2–3         6        3
   3–4         2        4
   4–5         —        —
   5–6         —        2

  Total       55       45




























From this can be calculated $m = 0.01$, $\sigma = 1.682$, $\gamma_1 = 0.732$.

$\sigma$ is found to differ from the expected value ($\sqrt{2}$) by twice its mean error, while the
coefficient of asymmetry differs from zero by three times its error. The deviations are explained by
the two extreme values. If these are deleted good agreement with the theory is found.

This material was first used for illustrating the t-distribution in a combination of
components, corresponding to equation (40). This function may be dimensioned down to
one-fifth, thus having the component means at 19, 20, and 24 respectively, and the component
variance equal to 1. We choose the combination r = (2 2 1), i.e. 2 items from the first,
2 from the second, and 1 from the third component. The corresponding parent variation is
characterized by $M' = 20.40$, $\sigma' = 2.107$.

A series of samples from this scheme is obtained by subtracting 1 from two and adding 
4 to one of the five items in each of the previous 100 normal samples. 
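The parent characteristics of the combination r = (2 2 1) can be verified directly. The sketch below assumes that the parent variance is the common component variance plus the between-component variance of the means, weighted by the number of items taken from each component; the variable names are illustrative.

```python
import math

means = [19.0, 20.0, 24.0]   # component means after scaling down to one-fifth
counts = [2, 2, 1]           # items taken from each component, r = (2 2 1)
component_variance = 1.0

n = sum(counts)
parent_mean = sum(c * m for c, m in zip(counts, means)) / n
between = sum(c * (m - parent_mean) ** 2 for c, m in zip(counts, means)) / n
parent_sd = math.sqrt(component_variance + between)
print(round(parent_mean, 2), round(parent_sd, 3))
```

Relative to a normal sample centred at 20, the shifts of −1, −1, 0, 0 and +4 reproduce the same component means, which is the construction described above.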

Calculating $t' = (\bar{x}' - M')\sqrt{N}/s'$ will give a series of quotients where the numerator is the
same as in the t previously calculated, but the denominator is approximately doubled. In
fact, we have a central sub-normal t with the following distribution:














      t              f(t)
                  −        +

   0.0–0.3        24       16
   0.3–0.6        17       14
   0.6–0.9        10        7
   0.9–1.2         4        8

   Total          55       45

















The following characteristics were obtained:
$m = -0.01$, $\sigma = 0.549$, $\gamma_1 = 0.213$.
The standard deviation is in full agreement with that which can be directly calculated
from formula (22) (0.482).


If, on the other hand, we calculate $t'' = (\bar{x}' - M)\sqrt{N}/s'$, we have a non-central t, the
distribution of which is:





[Frequency table of f(t) for the non-central t″: entries not legible in the source.]








































From this are calculated
$m = 0.41$, $\sigma = 0.562$, $\gamma_1 = 0.362$.


The standard deviation is now a little larger than in the previous case, having a value
fully consistent with the theoretical value (0.493).

The series of normal deviates was further used for elucidating the sampling process from
the function (42). The combination selected for this purpose was r = (2 3), 2 items from the
first component and 3 from the second. The corresponding sample is obtained by multiplying
3 among the 5 items of each normal sample by 1.5.

The sample means will have the expectation 0, but the variance will be greater; hence the
distribution of $t'' = \bar{x}''\sqrt{N}/s''$ may be expected to have a sub-normal variance. The 100 values
are distributed as follows:





























      t              f(t)
                  −        +

   0.0–1.0        27       20
   1.0–2.0        19       15
   2.0–3.0         8        6
   3.0–4.0         1        2
   4.0–5.0         —        1
   5.0–6.0         —        1

   Total          55       45

Here we obtain $m = -0.01$, $\sigma = 1.622$, $\gamma_1 = 0.616$.


The variance and the asymmetry are greater than expected, but less than in the original
experimental series of normal theory t. If the aforesaid two extremes are deleted, most of
the asymmetry also vanishes. No significant differences will then remain between the
observed and the theoretical normal variance. In fact, the experimental variance is
compatible with a decidedly sub-normal theoretical variance of t.


V. COMMENTS AND SUMMARY 


The distribution of ‘Student’-Fisher’s t when sampling from a non-normal universe
(principle b of p. 429 above) has been studied by means of experimental sampling by a number of
authors. Reference may be made to works by G. A. Baker (1932), M. S. Bartlett (1935),
A. N. K. Nair (1942), E. S. Pearson & N. K. Adyanthaya (1928, 1929) and H. L. Rietz (1939).

Some of the studies show fairly small effects on the t-distribution arising from moderate
non-normality in the sampled universe. Others, however, show decided deviations from the
normal theory t-distribution, even for universes which cannot in any way be considered
extremely non-normal. These somewhat disparate findings appear more consistent if we pay
regard to the fact that the various parent populations have usually been characterized only
by means of the first two $\beta$-coefficients (or their equivalents). The experimental results,
together with various distributions derived theoretically, give the impression that
characterizing the universe by a couple of parameters is unsatisfactory. The structure of the
underlying variation also plays an important role.













By varying the parameters of the compound normal functions, these can be made to
describe a variety of different distribution types. Two examples have been presented here,
both capable of being directly related to practical problems. It does not seem necessary,
however, to maintain such a logical construction of the variation. Instead, the compound
normal functions might be used as a more general way of studying the influence of
non-normality on the t-distribution.

It was shown in the present paper:


1. That for compound normal functions with varying component-means, the t-distribution
might differ considerably from the normal theory distribution, being decidedly skew
and having a larger variance.

2. That for compound normal functions with relatively small variation among the
component-variances, the t-distribution under certain limiting conditions has a sub-normal
variance; for more pronounced variation of variances, the influence is, at present, not
a priori determinable.


REFERENCES 


Baker, G. A. (1932). Ann. Math. Statist. 3, 1.

Bartlett, M. S. (1935). Proc. Camb. Phil. Soc. 31, 223.

Gayen, A. K. (1949). Biometrika, 36, 353.

Geary, R. C. (1936). J. R. Statist. Soc. Suppl. 3, 178.

Geary, R. C. (1947). Biometrika, 34, 209.

Hyrenius, H. (1949). Skand. AktuarTidskr. 32, 180.

Kendall, M. G. & Babington Smith, B. (1939). Tracts for Computers, no. 24. Cambridge University
Press.

Laderman, J. (1939). Ann. Math. Statist. 10, 376.

Nair, A. N. K. (1942). Sankhyā, 5, 393.

Pearson, E. S. & Adyanthaya, N. K. (1928, 1929). Biometrika, 20A, 356; 21, 259.

Perlo, V. (1933). Biometrika, 25, 203.

Quensel, C.-E. (1938). Acta Univ. Lund. N.F. 34, 4, 1.

Quensel, C.-E. (1944). Skand. AktuarTidskr. 27, 210.

Quensel, C.-E. (1947). Skand. AktuarTidskr. 30, 44.

Rider, P. R. (1929). Biometrika, 21, 124.

Rider, P. R. (1931). Ann. Math. Statist. 2, 48.

Rietz, H. L. (1939). Ann. Math. Statist. 10, 265.

Robbins, H. (1948). Ann. Math. Statist. 19, 406.

Weibull, M. (1950). Skand. AktuarTidskr. 33. (In the press.)

Wold, H. (1948). Tracts for Computers, no. 25. Cambridge University Press.
















MISCELLANEA 


The comparison of pairs of treatments in split-plot experiments
By J. TAYLOR, East Malling Research Station


Consider an experiment in J randomized blocks with K primary treatments. Suppose further that each
of the JK plots is divided into L subplots upon which L secondary treatments are disposed at random.
The appropriate analysis of variance is well known (see, for example, Yates, 1937, p. 73), but there has
been some doubt how to test the significance of differences between combinations of primary and
secondary treatments with different primary treatments (as, for example, in testing whether the primary
treatments differ for a given secondary treatment). This note presents a solution of the problem.

Let main plots be distributed within blocks normally with variance $\sigma_1^2$ (measured on a subplot basis),
and let the subplots be distributed within a main plot normally with variance $\sigma_2^2$. Then the difference of
means of such a pair of treatment combinations as is being considered will have variance

$$\frac{2}{J}\,(\sigma_1^2 + \sigma_2^2).$$


An estimate of this is $(\lambda_1 S_1^2 + \lambda_2 S_2^2)$, where $S_1^2$ and $S_2^2$ are the main plot and subplot error mean squares,
their degrees of freedom being respectively $f_1$ and $f_2$. Also $\lambda_1$ equals $2/(JL)$ and $\lambda_2$ equals $2(L-1)/(JL)$.

Wishart (1940, p. 28) recommended testing the difference of means by a t-test, with $\sqrt{(\lambda_1 S_1^2 + \lambda_2 S_2^2)}$ as
standard error and the lesser of $f_1$ and $f_2$ as degrees of freedom, for safety. An exact solution has been
made possible by the recent work of Welch (1947, 1949) and Aspin (1948, 1949). A difference in
notation must be mentioned lest it be thought that their method cannot validly be applied. They discuss
an estimate having true sampling variance $(\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2)$, and let $S_1^2$ and $S_2^2$ be estimates of $\sigma_1^2$ and $\sigma_2^2$
based on $f_1$ and $f_2$ degrees of freedom. In our case the estimate has sampling variance $\{\lambda_1(L\sigma_1^2 + \sigma_2^2) + \lambda_2\sigma_2^2\}$
and $S_1^2$ and $S_2^2$ are independent estimates of $(L\sigma_1^2 + \sigma_2^2)$ and $\sigma_2^2$.

The tables given by Aspin (1949) for certain significance levels and combinations of $f_1$ and $f_2$ may be
used, or the approximation suggested by Welch (1949). This consists in carrying out a t-test with degrees
of freedom $F$, given by

$$\frac{1}{F} = \frac{c^2}{f_1} + \frac{(1-c)^2}{f_2} = \frac{c^2}{(J-1)(K-1)} + \frac{(1-c)^2}{(J-1)K(L-1)},$$

where

$$c = \frac{\lambda_1 S_1^2}{\lambda_1 S_1^2 + \lambda_2 S_2^2} = \frac{S_1^2}{S_1^2 + (L-1)S_2^2}.$$

It will be seen from this that Wishart’s method erred on the side of safety, as was intended. It is of 
interest to note that the same result can also be derived from the approximate distribution given by 
Satterthwaite (1946). 
As an example of split-plot analyses Yates (1937) discussed a trial with oats. Three varieties were
assigned to main plots in six randomized blocks, and four levels of nitrogenous manuring were applied
to subplots in each main plot. The analysis quoted, carried out on plot yields in units of ¼ lb., gave

$$S_1^2 = 601.33 \quad (f_1 = 10)$$

and

$$S_2^2 = 177.08 \quad (f_2 = 45).$$

Hence

$$\sqrt{(\lambda_1 S_1^2 + \lambda_2 S_2^2)} = 9.715$$

and

$$c = \frac{601.33}{601.33 + 3(177.08)} = 0.5309.$$

Suppose we wish to compare the mean yields per plot of two varieties for a given level of manuring,
carrying out a two-sided test at the 5 % level of significance. Following Wishart’s method we should
carry out a t-test with 10 degrees of freedom. Then $t = 2.228$, and the difference of means will be adjudged
significant if it exceeds $2.228 \times 9.715 = 21.65$ units. Using Welch’s approximation we first calculate

$$\frac{1}{F} = \frac{(0.5309)^2}{10} + \frac{(0.4691)^2}{45},$$



















giving $F = 30.2$. The corresponding value of t is 2.041, so the difference of means may be adjudged
significant if it exceeds $2.041 \times 9.715 = 19.83$ units. This is 8.4 % less than the value given by Wishart’s method.

Applications of the method if Latin squares or some other lay-out be used instead of randomized
blocks can readily be obtained.
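The arithmetic of the oats example can be retraced as follows. This is only a sketch of the calculation described above: the function name `welch_df` is illustrative, and the two critical values of t are copied from the text rather than computed, so that no statistical library is assumed.

```python
import math

def welch_df(c, f1, f2):
    """Approximate degrees of freedom F from 1/F = c^2/f1 + (1 - c)^2/f2."""
    return 1.0 / (c**2 / f1 + (1.0 - c) ** 2 / f2)

# Lay-out of the oats trial: J blocks, K varieties, L levels of manuring.
J, K, L = 6, 3, 4
S1_sq, S2_sq = 601.33, 177.08           # main plot and subplot error mean squares
f1 = (J - 1) * (K - 1)                  # degrees of freedom of S1_sq
f2 = (J - 1) * K * (L - 1)              # degrees of freedom of S2_sq
lam1 = 2.0 / (J * L)
lam2 = 2.0 * (L - 1) / (J * L)

se = math.sqrt(lam1 * S1_sq + lam2 * S2_sq)
c = lam1 * S1_sq / (lam1 * S1_sq + lam2 * S2_sq)
F = welch_df(c, f1, f2)

# Two-sided 5 % points of t quoted in the text for 10 and 30.2 degrees of freedom.
t_wishart, t_welch = 2.228, 2.041
limit_wishart = t_wishart * se
limit_welch = t_welch * se
print(f1, f2, round(se, 3), round(c, 4), round(F, 1))
```

The two limits agree with the 21.65 and 19.83 units of the text to within rounding of the standard error.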


REFERENCES 


Aspin, A. A. (1948). An examination and further development of a formula occurring in the problem
of comparing two mean values. Biometrika, 35, 88-96.

Aspin, A. A. (1949). Tables for use in comparisons whose accuracy involves two variances, separately
estimated. Biometrika, 36, 290-3.

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components.
Biometrics Bull. 2, 110-14.

Welch, B. L. (1947). The generalization of ‘Student’s’ problem when several different population
variances are involved. Biometrika, 34, 28-35.

Welch, B. L. (1949). Further note on Mrs Aspin’s tables and on certain approximations to the tabled
function. Biometrika, 36, 293-6.

Wishart, J. (1940). Field Trials: Their Lay-out and Statistical Analysis. Imp. Bur. of Plant Breeding
and Genetics.

Yates, F. (1937). The Design and Analysis of Factorial Experiments. Tech. Commun. Bur. Soil Sci.
no. 35.


On the best unbiased quadratic estimate of the variance
By H. NAGLER


The well-known formula

$$s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1), \qquad (1)$$


for the estimate of the variance of a population from which n independent sample observations x; have 
been drawn, was shown by Hsu (1938) and later by Halmos (1946) to be a best unbiased estimate. In the 
present note this result will be derived afresh from quite elementary considerations. 

An expression is said to give an unbiased estimate of a parameter $\theta$ if its expectation equals $\theta$; the
estimate is called best if, in addition, it has the least variance of all unbiased estimates. We begin by
showing that if $F(x_1, \ldots, x_n)$, or $F(x_i)$ for short, is a function whose expectation is $E(F)$, then there exists
a function symmetric in the $x_i$, with equal expectation, but with smaller variance, except when $F(x_i)$
is already symmetrical, in which case the variance remains unchanged.

For the proof consider the points $P = (x_1, \ldots, x_n)$ of the sample space such that

$$x_1 \le x_2 \le \ldots \le x_n,$$

and with each such point P associate the points whose co-ordinates have the same value, but in all possible
orders. As an example, if P = (1 2 3), associate with it the points (1 3 2), (2 1 3), (2 3 1), (3 1 2), (3 2 1). Let
$p$ denote the probability of P and let the suffix $(m)$ stand for the permutation of the co-ordinates $x_1, \ldots, x_n$.
Then a new function $G(x_i)$ may be defined by the rule

$$G(x_i) \sum_{(m)} p_{(m)} = \sum_{(m)} p_{(m)} F(x_{(m)}), \qquad (2)$$

so that the value of $G(x_i)$ at each of a number of associated points is given by the weighted mean of $F(x_i)$
over all associated points. From this it follows that $E(F) = E(G)$; that if $F(x_i)$ is already symmetrical
then $F(x_i) = G(x_i)$, whilst, if it is not, then its variance is greater than that of $G(x_i)$. The truth of this
last assertion may be demonstrated by forming associated points into groups. The variance is composed
of between-group and within-group variation; the former is the same for the functions $F(x_i)$ and $G(x_i)$,
whilst the latter vanishes for $G(x_i)$, but not for $F(x_i)$.

The preceding result does not exact any limitations on the nature of the distributions of the $x_i$; their
distributions need not be the same, nor need the variations be independent. In order to be able to use this
result in the case where the function used for estimating $\theta$ is of a prescribed form, we must make certain
that the law of formation of $G(x_i)$ from $F(x_i)$, as given by (2), leads, when $F(x_i)$ is of the prescribed form,
to a $G(x_i)$ which is also of that form. A symmetric quadratic form has two disposable constants; if $F(x_i)$
is a quadratic form and the number of groups of associated points is in excess of two, then, in general,
$G(x_i)$, as defined by relation (2), will no longer be a quadratic form.













The principal circumstance in which, however, $G(x_i)$ will remain a quadratic form is that in which the
value of $p_{(m)}$ is the same at each of the associated points in a group. This will always be the case when the
$x_i$ are independent and have, as we henceforth assume, the same distribution. But it is also possible when
the $x_i$ are dependent, as in the example of an urn, containing tickets with numbers written on them,
from which n tickets are removed at random, and placed at random, one each, into n other urns, numbered
from 1 to n. Here, if $x_i$ is the number written upon the ticket which comes to be placed in urn i, the $x_i$ are
not independent, but the probability of obtaining a particular set of values $x_i$ does not depend on the
order of these values.

If, then, affairs are such that $G(x_i)$ will be a quadratic form provided $F(x_i)$ is, it follows that a quadratic
form which is unsymmetrical in the $x_i$ cannot have minimum variance; and so the quadratic form of the
best unbiased estimate will have to be symmetrical.

If, furthermore, we may assume that $p_{(m)}$ is identical at each of a group of associated points, it follows
that the probability of a particular set of $r < n$ values being assumed is the same for any r of the n sample
observations $x_i$. For r = 2 this implies that the correlation between any two different sample observations
$x_i$ and $x_j$ is the same. Let it be denoted by $\rho$, and let $\mu$ and $\sigma^2$ denote the common mean and variance of
the $x_i$. Then

$$E(x_i x_j) = \mu^2 + \sigma^2\rho \quad \text{when } i \ne j,$$

and

$$E(x_i x_j) = \mu^2 + \sigma^2 \quad \text{when } i = j.$$

Now a quadratic form symmetrical in the $x_i$ may be written in the form

$$Q = h \sum_{1}^{n} x_i^2 + k \Big( \sum_{1}^{n} x_i \Big)^2,$$

where h and k have to be determined from the condition $E(Q) = \sigma^2$. But

$$E(Q) = hn(\mu^2 + \sigma^2) + k\{n(\mu^2 + \sigma^2) + n(n-1)(\mu^2 + \sigma^2\rho)\}$$
$$= \mu^2 n(h + nk) + \sigma^2 n\{h + k[1 + (n-1)\rho]\}.$$

We must therefore have

$$h + nk = 0$$

and

$$n\{h + k[1 + (n-1)\rho]\} = 1,$$

whence

$$h = 1/\{(n-1)(1-\rho)\}$$

and

$$k = -1/\{(n-1)(1-\rho)\,n\}.$$

This gives for the best unbiased quadratic estimate of the variance

$$s^2 = \sum_{1}^{n} (x_i - \bar{x})^2 / \{(1-\rho)(n-1)\},$$

where

$$\bar{x} = \sum_{1}^{n} x_i / n,$$

the sample mean. Formula (1) is a particular case of this for independent sample observations.
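The solution for h and k can be checked with exact rational arithmetic. The sketch below evaluates $E(Q)$ from the two product moments $E(x_i^2) = \mu^2 + \sigma^2$ and $E(x_i x_j) = \mu^2 + \sigma^2\rho$; the function names and the particular numerical values are illustrative choices only.

```python
from fractions import Fraction

def expected_Q(n, mu, var, rho, h, k):
    """E(Q) for Q = h*sum(x_i^2) + k*(sum x_i)^2 under equal correlation rho."""
    e_square = mu**2 + var          # E(x_i^2)
    e_cross = mu**2 + var * rho     # E(x_i x_j) for i != j
    return h * n * e_square + k * (n * e_square + n * (n - 1) * e_cross)

def best_coefficients(n, rho):
    """Coefficients h and k obtained from h + nk = 0 and the unbiasedness condition."""
    h = Fraction(1) / ((n - 1) * (1 - rho))
    return h, -h / n

# Arbitrary illustrative values of n, mu, sigma^2 and rho.
n, mu, var, rho = 5, Fraction(3), Fraction(7, 2), Fraction(1, 4)
h, k = best_coefficients(n, rho)
print(h, k, expected_Q(n, mu, var, rho, h, k))
```

With $\rho = 0$ the same coefficients give $h = 1/(n-1)$ and $k = -1/\{n(n-1)\}$, which is formula (1).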


Now by a well-known theorem $\rho \ge -1/(n-1)$; it follows from this that

$$s^2 \ge \sum_{1}^{n} (x_i - \bar{x})^2/n,$$

not only when the sample observations are independent, but also when they are equally correlated.

In conclusion, it may be mentioned that the same reasoning is applicable to a number of other
estimation problems, notably those dealing with moments of distributions. Halmos has given a full discussion
of estimation for the less general case of independent samples.


REFERENCES 


Halmos, Paul R. (1946). The theory of unbiased estimation. Ann. Math. Statist. 17, 34.
Hsu, P. L. (1938). On the best unbiased quadratic estimate of the variance. Statist. Res. Mem. 2, 91.










The cumulants of the first n natural numbers
By A. STUART


1. The discontinuous rectangular distribution formed by the first n natural numbers often occurs in
ranking problems. Its moments about the origin zero can be obtained from the well-known relation

$$\sum_{s=1}^{n} s^r = \frac{1}{r+1}\,\phi_{r+1}(n+1),$$

where $\phi_{r+1}(n+1)$ is the Bernoullian polynomial of degree $(r+1)$ in $(n+1)$ given by the coefficient of
$\theta^{r+1}/(r+1)!$ in the expansion of $\theta\{e^{(n+1)\theta} - 1\}/(e^{\theta} - 1)$. Thus

$$\mu'_r = \frac{1}{n(r+1)}\,\phi_{r+1}(n+1), \qquad (1)$$

and if we obtain the moments about the mean by the usual method, we find

$$\mu'_1 = \tfrac{1}{2}(n+1),$$
$$\mu_2 = \tfrac{1}{12}(n^2-1),$$
$$\mu_4 = \tfrac{1}{240}(n^2-1)(3n^2-7), \qquad (2)$$
$$\mu_6 = \tfrac{1}{1344}(n^2-1)(3n^4 - 18n^2 + 31).$$

The moments of odd order are zero by symmetry. The expression for $\mu_{2r}$ becomes more complicated as
r increases.


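2. The cumulants do not appear to have been derived explicitly, although the remarkably simple
formula given below is implicit in the work of Craig (1936) on average corrections for grouping in
discrete distributions. The characteristic function of the distribution about its mean $\tfrac{1}{2}(n+1)$ is

$$\phi(t) = \sum_{x=1}^{n} \exp\{\theta[x - \tfrac{1}{2}(n+1)]\}/n,$$

where $\theta = it$. Summing the geometrical progression, we have

$$\phi(t) = \exp\{-\tfrac{1}{2}(n-1)\theta\}\,(e^{n\theta} - 1)/n(e^{\theta} - 1)$$
$$= \sinh \tfrac{1}{2}n\theta / n \sinh \tfrac{1}{2}\theta. \qquad (3)$$

(3) may be written

$$\phi(t) = \tfrac{1}{2}\theta \sinh \tfrac{1}{2}n\theta / \tfrac{1}{2}n\theta \sinh \tfrac{1}{2}\theta,$$

and the cumulative function is therefore

$$\psi(t) = \log(\sinh \tfrac{1}{2}n\theta / \tfrac{1}{2}n\theta) - \log(\sinh \tfrac{1}{2}\theta / \tfrac{1}{2}\theta). \qquad (4)$$

Now

$$\log(\sinh \tfrac{1}{2}x / \tfrac{1}{2}x) = \sum_{r=2}^{\infty} B_r x^r / r \cdot r!$$

(Kendall, 1943), where $B_r$ is the rth Bernoullian number, given by the coefficient of $\theta^r/(r!)$ in the expansion
of $\theta/(e^{\theta} - 1)$.

Thus (4) becomes

$$\psi(t) = \sum_{r=1}^{\infty} B_{2r}(n^{2r} - 1)\,\theta^{2r} / 2r \cdot (2r)!,$$

whence

$$\kappa_{2r} = B_{2r}(n^{2r} - 1)/2r. \qquad (5)$$

For example,

$$\kappa_2 = \tfrac{1}{12}(n^2 - 1),$$
$$\kappa_4 = \tfrac{1}{120}(1 - n^4),$$
$$\kappa_6 = \tfrac{1}{252}(n^6 - 1), \qquad (6)$$
$$\kappa_8 = \tfrac{1}{240}(1 - n^8),$$
$$\kappa_{10} = \tfrac{1}{132}(n^{10} - 1).$$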
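Formula (5) can be confirmed numerically for small n. The following sketch, in exact rational arithmetic, computes the moments of the first n natural numbers about their mean, converts them to cumulants by the usual recurrence between moments and cumulants, and compares the result with $B_{2r}(n^{2r} - 1)/2r$. The function names are illustrative, and the Bernoulli recurrence used is the standard one for the convention $B_1 = -\tfrac{1}{2}$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m):
    """B_0, ..., B_m as coefficients of theta^r/r! in theta/(e^theta - 1)."""
    B = [Fraction(1)]
    for r in range(1, m + 1):
        B.append(-sum(comb(r + 1, j) * B[j] for j in range(r)) / (r + 1))
    return B

def central_moments(n, m):
    """Moments of orders 0..m of the numbers 1..n about their mean."""
    mean = Fraction(n + 1, 2)
    return [sum((x - mean) ** r for x in range(1, n + 1)) / n for r in range(m + 1)]

def cumulants_from_central_moments(mu):
    """Cumulants from moments about the mean, so that kappa_1 = 0."""
    kappa = [Fraction(0)] * len(mu)
    for r in range(2, len(mu)):
        kappa[r] = mu[r] - sum(
            comb(r - 1, j - 1) * kappa[j] * mu[r - j] for j in range(1, r)
        )
    return kappa

def formula_5(n, r, B):
    """kappa_{2r} according to formula (5)."""
    return B[2 * r] * (n ** (2 * r) - 1) / (2 * r)

B = bernoulli_numbers(10)
kappa = cumulants_from_central_moments(central_moments(7, 10))
print([str(kappa[2 * r]) for r in range(1, 6)])
```

For n = 7, the cumulants of orders 2 to 10 obtained from the moments coincide with those given by (5) and (6).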
REFERENCES 


Craig, C. C. (1936). Sheppard’s corrections for a discrete variable. Ann. Math. Statist. 7, 55.
Kendall, M. G. (1943). The Advanced Theory of Statistics, 1, 78 (note). London: Griffin and Co.

























Note on the $\chi^2$ smooth test
By D. A. S. FRASER, University of Toronto


1. In his recent contribution to Biometrika, Seal (1948) presents a theorem to be used in generalizing
the $\chi^2$ test of a theoretical frequency curve or of a graduation of a mortality table. Although incorrectly
stated in his paper, the theorem produces a rather striking result for the normal distribution and is
then applied as an approximation to the multinomial distribution or to the joint distribution of a series
of binomial distributions.

The proof indicated is, however, applicable to the following corrected statement:

THEOREM. If $x_i$ $(i = 1, 2, \ldots, n)$ are the residuals of n independent normal variates, with means zero and
variances one, after removing the regression on k linearly independent vectors, then the probability
distribution of $q^2 = \sum x_i^2$ and the distribution of the signs of the residuals are independent. (The $x_i$ are the
residuals of n independent normal variates with means zero and variances one, after fitting k independent
homogeneous linear constraints by regression.)

The proof follows by showing that the conditional distribution of the signs, given $q^2$, is independent
of $q^2$. Because the original n normal variates were independent and had unit variances, the distribution
in n-space is spherically symmetrical and any orthogonal rotation of the n-space will give independent
variates. Hence the probability density function of the residuals is the conditional distribution of the
original normal variates in the linear subspace determined by the constraints, and has the following form:

$$c \exp[-\tfrac{1}{2} \sum x_i^2],$$

where c is a constant depending on the number of constraints and the $x_i$ are connected by the k
constraints. For each value of q the density is a constant. To find the probability of any particular
sign pattern is then to find the proportion of the surface of a hypersphere (in the subspace) which is
contained in a particular ‘quadrant’ of the n-dimensional space. Since the sphere has centre the origin,
this is independent of q, which is the radius of the hypersphere.

In the statement of the theorem in §1, the residuals were to have means zero and variances one. It is
interesting to note that frequently this situation is impossible, depending on the selection of constraints.
An example will illustrate this: Let $y_1$, $y_2$ be independent, normal, and with means zero and variances
one. Let $x_1$, $x_2$ be the residuals after fitting the constraint $y_1 = 0$. Then the variance of $x_1$ is zero. No other
selection of a joint distribution for $y_1$ and $y_2$ would overcome this.


2. Seal applies this theorem to testing the graduation of a mortality table where more than one
constraint is applied. Although there is independence between the usual $\chi^2$ and the sign patterns, the
basis for the suggested sign tests no longer exists. When the single constraint $\sum x_i = 0$ is applied, all sign
combinations are equally likely except two (all positive and all negative), which cannot occur. With
additional linear constraints, the sign patterns will no longer be equally likely, unless the constraints are
of the form $\sum \pm x_i = 0$, which is not the case for the graduation of the mortality table. It is worth noting
that for each additional constraint of the form $\sum \pm x_i = 0$, two sign patterns become impossible, but the
remaining sign patterns have equal probability.
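The effect of the single constraint $\sum x_i = 0$ can be illustrated by a small sampling experiment. The sketch below is illustrative only: it takes the residuals to be deviations from the sample mean, uses n = 3 (where the equality of the six remaining patterns follows from symmetry alone), and fixes an arbitrary seed; the function name and parameters are assumptions, not part of the original note.

```python
import random
from collections import Counter

def sign_pattern_counts(n, trials, seed=0):
    """Count sign patterns of residuals y_i - mean(y) for unit normal y_i."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(y) / n
        # True marks a positive residual, False a negative one.
        counts[tuple(v > mean for v in y)] += 1
    return counts

counts = sign_pattern_counts(n=3, trials=6000)
for pattern, count in sorted(counts.items()):
    print(pattern, count)
```

The all-positive and all-negative patterns never occur, and the six others appear with roughly equal frequency, close to 1000 each.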


3. In her paper in Biometrika, David (1948) discusses the effect of applying a sign test to a mortality
graduation using the moment type of constraint without taking account of the sign patterns which no
longer have equal probability. The conclusion is that, as the number of moment constraints increases,
the correlation between adjacent deviations approaches $-1$ and consequently the sign test as proposed
is of doubtful validity.

At the end of §2 of her paper, the relation

$$\theta_{ij} = \cos^{-1} r_{ij}$$

is presented without proof. The proof follows immediately by noting that the rearrangement of the
quadratic exponent implies that $z_1, \ldots, z_n$ are independent with equal variance, and hence the expression
for the correlation between two linear combinations of the $z_i$ will be identical with that for the cosine of
the angle between the two planes determined by the linear combinations set equal to zero.


4. Using the theorem in §1, a sign test can be constructed by calculating the probability for each sign
pattern, or for enough ‘extreme’ patterns to form a rejection region of the proper size. This would
involve integration to find the proportion of the surface of a hypersphere (in the subspace satisfying the
constraints) which is contained in the ‘quadrants’ of n-space corresponding to the appropriate sign
patterns. The work necessary to calculate these probabilities would be prohibitive unless n is small.

If it is desirable to combine the usual $\chi^2$ test with a test based on signs, the sign patterns could be
ordered by reference to a ‘reasonable’ criterion or, perhaps better, by reference to a representative
alternative hypothesis. From the data, the probability of a more extreme pattern could be calculated,
transformed to a $\chi^2$ with p degrees of freedom, and combined with the $\chi^2$ on $n - k$ degrees of freedom to
form a combined $\chi^2$ on $n - k + p$ degrees of freedom. The choice of p will determine the relative weight to
be given to the two tests. It should be made by considering alternative hypotheses, but this seemingly
would be a prohibitive task and might better be left to the discretion of the statistician in weighting the
tests before observing the data. Seal’s choice of p = 1 (1948) would appear to the writer to underestimate
greatly the sign test except perhaps when $n - k$ is small.


The author wishes to express his appreciation of valuable discussion with Dr Seal. 


REFERENCES 


David, F. N. (1948). Correlations between $\chi^2$ cells. Biometrika, 35, 418.
Seal, H. L. (1948). A note on the $\chi^2$ smooth test. Biometrika, 35, 202.


An alternative form of $\chi^2$
By F. N. DAVID


1. We consider a population, $\Pi$, divided into k groups or strata. The proportion in the ith stratum is
$p_i$ $(i = 1, 2, \ldots, k)$. A sample of N observations is randomly drawn from $\Pi$, the number coming from the
ith stratum being $n_i$ $(i = 1, 2, \ldots, k)$. If the functional form of the population is fully specified, the
proportions $p_i$ are known, and the quantity commonly calculated in order to test whether the sample is
representative of the population $\Pi$ is

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - Np_i)^2}{Np_i} = \sum_{i=1}^{k} \frac{n_i^2}{Np_i} - N.$$

This quantity has as its first two moments

$$E(\chi^2) = k - 1,$$
$$\sigma^2_{\chi^2} = 2(k-1)\left(1 - \frac{1}{N}\right) + \frac{1}{N}\left(\sum_{i=1}^{k} \frac{1}{p_i} - k^2\right),$$

and provided N is large, it is known that it may be assumed to be distributed as a Pearson Type III
variable with range zero to plus infinity. Recently, and in quite another connexion, Neyman (1949,
p. 239) has considered the quantity

$$\chi_1^2 = \sum_{i=1}^{k} \frac{(n_i - Np_i)^2}{n_i} = \sum_{i=1}^{k} \frac{N^2 p_i^2}{n_i} - N.$$

This quantity has certain advantages from the computational point of view, and it appears of interest
therefore to obtain some idea of how closely the distribution of $\chi_1^2$ approximates to the limiting distribution
of $\chi^2$.


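2. We take the expression

$$\chi_1^2 + N = \sum_{i=1}^{k} \frac{N^2 p_i^2}{n_i} = \sum_{i=1}^{k} \frac{Np_i}{1 + \dfrac{n_i - Np_i}{Np_i}},$$

and expand the denominator. This will be legitimate only for

$$|n_i - Np_i| < Np_i,$$

but we may assume that N is sufficiently large for this to be true in all but a negligible proportion of cases.
Thus

$$\chi_1^2 = \sum_{i=1}^{k} \frac{(n_i - Np_i)^2}{Np_i} - \sum_{i=1}^{k} \frac{(n_i - Np_i)^3}{N^2 p_i^2} + \sum_{i=1}^{k} \frac{(n_i - Np_i)^4}{N^3 p_i^3} - \ldots,$$

a series which is in increasing powers of 1/N. On taking expectations we have, to order $1/N^2$,

$$E(\chi_1^2) = k - 1 + \frac{1}{N}\left[2\sum_{i=1}^{k} \frac{1}{p_i} - 3k + 1\right] + \frac{1}{N^2}\left[6\sum_{i=1}^{k} \frac{1}{p_i^2} - 12\sum_{i=1}^{k} \frac{1}{p_i} + 7k - 1\right].$$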
















3. By taking N large enough, this last expression can be made to differ from $k - 1$ by as little as desired.
We should, however, try to obtain some idea as to how large N should be before this is approximately
true. If the observed data are not grouped, then it is often convenient to divide the population so that the
expectations are equal for each group; that is to say, we may make

$$p_i = \frac{1}{k} \quad (i = 1, 2, \ldots, k).$$

This arrangement will present the most favourable case for the mean of $\chi_1^2$. Assuming that the population
is divided in this way, we shall have

$$E(\chi_1^2) = k - 1 + \frac{1}{N}(k-1)(2k-1) + \frac{1}{N^2}(k-1)[6k(k-1) + 1],$$

and the closeness of the approximation to the $\chi^2$ first moment will therefore depend on the number of
groups as well as the number in the sample. In order that the term in $1/N^2$ shall be negligible we shall take
$N \ge 400$ and k = 5, 10, 15 (Table 1). For these values the divergence from the $\chi^2$ moments is not marked;
the true values are given in brackets.


Table 1. First moment of $\chi_1^2$

     N       k = 5     k = 10    k = 15
              (4)       (9)       (14)

    400      4.09       —         —
    500      4.07      9.34       —
   1000      4.04      9.17      14.41
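The entries of Table 1 follow from the expansion of the mean for equal proportions. The sketch below assumes, as the text indicates, that the tabulated values retain only the term in 1/N; the function name and the optional `order` argument are illustrative.

```python
def mean_chi1_sq(N, k, order=1):
    """E(chi_1^2) for k equally probable groups, to order 1/N (or 1/N^2)."""
    value = (k - 1) + (k - 1) * (2 * k - 1) / N
    if order >= 2:
        # Next term of the expansion, neglected in Table 1.
        value += (k - 1) * (6 * k * (k - 1) + 1) / N**2
    return value

# The (N, k) combinations tabulated above.
for N, k in [(400, 5), (500, 5), (500, 10), (1000, 5), (1000, 10), (1000, 15)]:
    print(N, k, round(mean_chi1_sq(N, k), 2))
```

Calling the function with `order=2` shows the size of the neglected term, which for N = 1000 and k = 15 is a little under 0.02.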




















4. The evaluation of the second moment of $\chi_1^2$ presents little difficulty in principle but is laborious
algebraically:

$$(\chi_1^2)^2 = \sum_{i} \frac{(n_i - Np_i)^4}{N^2 p_i^2} - 2\sum_{i} \frac{(n_i - Np_i)^5}{N^3 p_i^3} + 3\sum_{i} \frac{(n_i - Np_i)^6}{N^4 p_i^4}$$
$$+ \sum_{i \ne j} \frac{(n_i - Np_i)^2 (n_j - Np_j)^2}{N^2 p_i p_j} - 2\sum_{i \ne j} \frac{(n_i - Np_i)^2 (n_j - Np_j)^3}{N^3 p_i p_j^2}$$
$$+ 2\sum_{i \ne j} \frac{(n_i - Np_i)^2 (n_j - Np_j)^4}{N^4 p_i p_j^3} + \sum_{i \ne j} \frac{(n_i - Np_i)^3 (n_j - Np_j)^3}{N^4 p_i^2 p_j^2} + \ldots$$

The expectations of the numerators in the single sums are immediate since they are just binomial moments.
The evaluation of the expectations of the numerators of the cross-products is more difficult. They are not
listed in Haldane’s useful paper (1937), and it was found necessary to obtain them de novo. The quickest
method of performing this evaluation appears to be by making use of two series of standardized
independent characteristic random variables. The procedure is the usual one, with the exception of the
standardization of the variables. After a certain amount of substitution and reduction we obtain to
order 1/N

$$\sigma^2_{\chi_1^2} = 2(k-1) + \frac{1}{N}\left[-4k^2 - 32k + 14 + 22\sum_{i=1}^{k} \frac{1}{p_i}\right].$$

If all the population proportions are equal, this reduces to

$$\sigma^2_{\chi_1^2} = 2(k-1) + \frac{2(k-1)(9k-7)}{N}.$$

It would perhaps have been more satisfactory to evaluate this expression to order $1/N^2$, but this would
have entailed adding more terms to the expression given for $(\chi_1^2)^2$, and the result which would be achieved
did not seem worthy of the labour which would be involved.
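The reduction to the case of equal proportions can be checked numerically. The sketch below evaluates the order-1/N expression for the variance for an arbitrary set of proportions and compares it, for $p_i = 1/k$, with $2(k-1) + 2(k-1)(9k-7)/N$; the function names are illustrative.

```python
import math

def var_chi1_sq(N, p):
    """Variance of chi_1^2 to order 1/N for group proportions p."""
    k = len(p)
    sum_inverse = sum(1.0 / p_i for p_i in p)
    return 2 * (k - 1) + (22 * sum_inverse - 4 * k * k - 32 * k + 14) / N

def var_chi1_sq_equal(N, k):
    """The same variance when every proportion equals 1/k."""
    return 2 * (k - 1) + 2 * (k - 1) * (9 * k - 7) / N

for N, k in [(400, 5), (500, 10), (1000, 15)]:
    general = var_chi1_sq(N, [1.0 / k] * k)
    print(N, k, round(general, 2), round(math.sqrt(general), 2))
```

For N = 400 and k = 5 this gives a variance near 8.76 and a standard deviation near 2.96, against limiting values of 8 and 2.83.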

5. For the same values of N and k as in Table 1, we give the second moment of $\chi_1^2$ in Table 2; the limiting
values are given in brackets. The divergence from the limiting form of the moments is greater in this case
than for the mean, which might be expected.

Table 2. Second moment of $\chi_1^2$

                k = 5                 k = 10                k = 15
     N      μ₂ (8)   σ (2.83)    μ₂ (18)   σ (4.24)    μ₂ (28)   σ (5.29)

    400      8.76     2.96         —         —           —         —
    500      8.61     2.93       20.99      4.58         —         —
   1000      8.30     2.88       19.49      4.41       31.58      5.62

6. Some idea of when it is reasonable to regard $\chi_1^2$ as being distributed according to the limiting form
of $\chi^2$ may be obtained by considering the expansions in terms of the expectation in a single cell. Thus if we
let m be the expectation for every cell,

$$N = mk,$$

and the expansions become

$$E(\chi_1^2) = k - 1 + \frac{2k}{m} + \frac{6k}{m^2} + \left[-\frac{3}{m} - \frac{12}{m^2} + \frac{1}{mk} + \frac{7}{m^2 k} - \frac{1}{m^2 k^2}\right],$$
$$\sigma^2_{\chi_1^2} = 2(k-1) + \frac{18k}{m} - \frac{32}{m} + \frac{14}{mk}.$$

The term in $1/N^2$, not given here, in the second moment will contribute, among others, a term in $k/m^2$. It
is reasonable for the size of sample we are considering to ignore all terms except those in k divided by a
power of m. The other terms will make a contribution, but it will be of smaller order, and for the purposes
of our argument it is not necessary to consider them. We note that the multiplying factor of $k/m^2$ is of the
order k, and this will imply that it is necessary for $k^2$ to be negligible compared with $m^2$ before the moments
of $\chi_1^2$ approach nearly to those of $\chi^2$, and for k to be negligible compared with m before the moments are
approximately those of $\chi^2$. It is, however, difficult to judge on moments alone, and so we take another
approach.


7. In the limit, $\chi_1^2$ is distributed as $\chi^2$, a Pearson Type III curve with range zero to plus infinity. $\chi_1^2$ is
a sum of squares, which can be zero, and we therefore assume, as an approximation to the true distribution
of $\chi_1^2$, the curve

$$p(\chi_1^2) = \frac{s^r}{\Gamma(r)}\,(\chi_1^2)^{r-1}\,e^{-s\chi_1^2} \qquad (0 \le \chi_1^2 < +\infty),$$

where r and s are positive constants. We estimate r and s from the first two moments found for $\chi_1^2$, and
calculate the probability in the upper tail cut off by the tabulated value of $\chi^2$. The results for N = 1000
are given in Table 3. The value for k = 15 is certainly no larger than it should be. The effect of neglecting
terms of order $1/N^2$ appears on a further preliminary investigation to mean that we neglect a positive
contribution to the variance which would, if anything, make the figure 0.065 larger, and it would appear
therefore that for this number of groups there may be a serious risk of error in using $\chi_1^2$ and the tabulated
values of $\chi^2$.


Table 3. Probability integral of $\chi_1^2$ corresponding to the upper $\chi^2_{0.05}$ tail (N = 1000)

   k                                    5        10       15
   P{$\chi_1^2 > \chi^2_{0.05}$}      0.053    0.058    0.065
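
The calculation behind Table 3 can be sketched as follows, assuming the Type III curve is the gamma density $s^r x^{r-1} e^{-sx}/\Gamma(r)$, so that matching the first two moments gives $s$ = mean/variance and $r$ = mean × $s$. The function names are illustrative, and the upper tail is obtained from the standard series for the lower incomplete gamma function rather than from any method stated in the note.

```python
import math


def fit_by_moments(mean, variance):
    """Return (r, s) of the gamma density whose mean is r/s and variance r/s^2."""
    s = mean / variance
    r = mean * s
    return r, s


def gamma_upper_tail(r, s, x):
    """P(X > x) for the density s^r x^(r-1) e^(-s x) / Gamma(r)."""
    t = s * x
    if t <= 0:
        return 1.0
    # Series: lower incomplete gamma = t^r e^(-t) * sum_n t^n / (r (r+1) ... (r+n)).
    term = 1.0 / r
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= t / (r + n)
        total += term
    lower = total * math.exp(r * math.log(t) - t - math.lgamma(r))
    return 1.0 - lower


# Sanity check with the limiting moments for k = 5 (mean 4, variance 8) and the
# tabulated 5% point of chi-square on 4 degrees of freedom, 9.488.
r, s = fit_by_moments(4.0, 8.0)
print(r, s, gamma_upper_tail(r, s, 9.488))
```

With the limiting moments the fitted curve is $\chi^2$ itself, so the tail comes out very close to 0.05. Supplying the finite-N mean and second moment of $\chi_1^2$ in place of the limiting values is the corresponding calculation for the entries of Table 3.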











8. It is perhaps of interest to compare the true $\chi^2$ moments under similar conditions. When all the p's are
equal we have

$$E(\chi^2) = k-1,$$

$$\sigma^2_{\chi^2} = 2(k-1) - \frac{2(k-1)}{N}.$$

The mean, as always, is exact. We see that provided k is negligible compared with the sample size (not
the cell expectation this time), the variance of $\chi^2$ has approximately its limiting value. Certainly for
N = 1000 the difference between the limiting $\chi^2$ tail and the true $\chi^2$ tail will be inconsiderable, even
for k > 15.







9. From the foregoing analysis we would draw the conclusion that the quantity

$$\chi_1^2 = \sum_{i=1}^{k} \frac{(n_i - Np_i)^2}{n_i}$$

should not be used instead of $\chi^2$ for testing goodness of fit. It is true that it is easier to calculate in that
one divides by (or multiplies by the reciprocal of) a whole number, but the ease of computation is more
than counterbalanced by the inaccuracy which may occur in testing significance. Neyman uses $\chi_1^2$ as
a means of obtaining asymptotically normal unbiased estimates and the results of this investigation are
not necessarily applicable to his work.
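
To make the distinction between the two quantities concrete, the sketch below computes both for one set of cell counts. The counts are hypothetical and the function names are illustrative; the only difference between the two sums is the divisor, the expected frequency $Np_i$ for $\chi^2$ and the observed frequency $n_i$ for $\chi_1^2$.

```python
def chi_square(counts, probs):
    """Usual chi-square: squared deviations divided by expected frequencies N*p_i."""
    N = sum(counts)
    return sum((n - N * p) ** 2 / (N * p) for n, p in zip(counts, probs))


def chi_one_square(counts, probs):
    """Modified quantity: squared deviations divided by observed frequencies n_i."""
    if any(n == 0 for n in counts):
        raise ValueError("chi_1^2 is undefined when an observed frequency is zero")
    N = sum(counts)
    return sum((n - N * p) ** 2 / n for n, p in zip(counts, probs))


# Hypothetical sample: N = 100 observations in k = 5 equally likely cells.
counts = [18, 22, 25, 15, 20]
probs = [0.2] * 5
print(chi_square(counts, probs), chi_one_square(counts, probs))
```

For these counts $\chi^2$ is 2.9 and $\chi_1^2$ is a little over 3.07. The modified quantity is the larger, in the same direction as the excess of its mean over $k-1$.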


REFERENCES 


HALDANE, J. B. S. (1937). Biometrika, 29, 133.
Neyman, J. (1949). Contribution, pp. 239-73, included in Proceedings of the Berkeley Symposium on 
Mathematical Statistics and Probability. University of California Press. 


On a theorem concerning the secondary subscripts of deviations in multivariate 
correlation using Yule’s notation 


By K. N. CHANDLER 


In his original paper G. U. Yule (1907) enunciated the theorem (§ 7): 


I. ‘The product-sum of any two deviations of the same order, with the same secondary suffixes, is 
unaltered by omitting any or all of the secondary subscripts of either.’ 

The converse was stated: 

II. ‘The product-sum of any deviation of order p with a deviation of order p+q, the p subscripts 
being the same in each case, is unaltered by adding to the secondary subscripts of the former any or all 
of the g additional subscripts of the latter.’ 

The proof of I and II therein contained proves theorem II and also a more general theorem than I, 
namely: 

III. The product-sum of any two deviations in which all the secondary subscripts of the first are
secondary subscripts of the second, is unaltered by omitting any or all of the secondary subscripts of
the first.

This does not require, as does I, that we start from subscripts of the same order. The difference between 
I and III is really trifling, as III can easily be deduced from I. 

It is of theorem III that II is the converse.

In a number of modern texts, neither theorem I nor III occurs; but there is a somewhat similar theorem 
(e.g. Yule & Kendall, 1937, p. 265): 

IV. ‘The product-sum of any two deviations is unaltered by omitting any or all of the secondary 
subscripts of either which are common to the two.’ 

Theorem II is stated and called the converse of this; but, in fact, theorem IV is not true; for if it were,
we should have

$$\sum x_{1.2}\,x_{1.23} = \sum x_{1.2}\,x_{1.3},$$

whence

$$\sum x_{1.2}\,(x_1 - b_{12.3}x_2 - b_{13.2}x_3) = \sum x_{1.2}\,(x_1 - b_{13}x_3),$$

i.e.

$$\sum x_{1.2}x_1 - b_{13.2}\sum x_{1.2}x_3 = \sum x_{1.2}x_1 - b_{13}\sum x_{1.2}x_3,$$

i.e. $b_{13.2} = b_{13}$ or $\sum x_{1.2}x_3 = 0$.

(a) If $b_{13.2} = b_{13}$, then

$$\frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2} = r_{13},$$

i.e. $r_{23}(r_{12} - r_{13}r_{23}) = 0$. But

$$b_{12.3} = \frac{\sigma_1}{\sigma_2}\cdot\frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2},$$

so that, if $b_{13.2} = b_{13}$, then either $r_{23} = 0$ or 1 or $-1$, or $b_{12.3} = 0$,
which, in general, is not so.

(b) Alternatively, if $\sum x_{1.2}x_3 = 0$, then $\sum x_{1.2}x_{3.2} = 0$, i.e. $r_{13.2} = 0$, or $b_{13.2} = 0$,
which again, in general, is not so.


It would appear that the enunciation of theorem IV has not led to any unsound formulae being 
included in the texts concerned. 
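
The failure of theorem IV, and the validity of theorem III, can be illustrated numerically. The sketch below uses a small made-up data set (not taken from the note) and computes the deviations as least-squares residuals; the helper names are illustrative.

```python
def centre(xs):
    """Deviations of xs from their mean."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def dot(a, b):
    """Product-sum of two series of deviations."""
    return sum(u * v for u, v in zip(a, b))


def resid1(y, x):
    """Deviation y.x: y minus its least-squares regression on x."""
    b = dot(y, x) / dot(x, x)
    return [yi - b * xi for yi, xi in zip(y, x)]


def resid2(y, x, z):
    """Deviation y.xz: y minus its least-squares regression on x and z."""
    sxx, szz, sxz = dot(x, x), dot(z, z), dot(x, z)
    sxy, szy = dot(x, y), dot(z, y)
    det = sxx * szz - sxz ** 2
    bx = (sxy * szz - szy * sxz) / det  # partial regression coefficient on x
    bz = (szy * sxx - sxy * sxz) / det  # partial regression coefficient on z
    return [yi - bx * xi - bz * zi for yi, xi, zi in zip(y, x, z)]


# Hypothetical observations on three variables.
x1 = centre([1, 2, 4, 3, 6, 8])
x2 = centre([2, 1, 3, 5, 4, 9])
x3 = centre([2, 1, 3, 5, 2, 5])

x1_2 = resid1(x1, x2)
x1_3 = resid1(x1, x3)
x1_23 = resid2(x1, x2, x3)

lhs = dot(x1_2, x1_23)   # product-sum of x_{1.2} and x_{1.23}
rhs = dot(x1_2, x1_3)    # theorem IV would require this to equal lhs
iii = dot(x1, x1_23)     # theorem III: the subscript 2 dropped from x_{1.2}
print(lhs, rhs, iii)
```

For these data the first and third product-sums agree (about 4.91), as theorem III requires, while the second is about 13.51, so that omitting the common subscript 2 from $x_{1.23}$ alters the product-sum.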


REFERENCES 


YULE, G. U. (1907). Proc. Roy. Soc. A, 79, 182.
YULE, G. U. & KENDALL, M. G. (1937). Introduction to the Theory of Statistics, 13th ed. London:
Griffin and Co.


CORRIGENDA 


H. O. Lancaster, Biometrika (1949), 36 
n-1 a—1l ' 
p- 119, line 9: read > x? not & y?. 
i=1 4=1 


p. 122, 2 lines above equation (24): read $P(a_{ij} \mid a_{i.}, a_{.j})$ not $P(a_{ij}, a_{i.}, a_{.j})$.
p. 124, equation (36): read Xm,cy, not m,c,. 

equation (38): $\Sigma$ should appear before ‘(observed − expected)²’.
p. 371, 3 lines from end of §3: read (4+ 4)? not (4+ 4)4. 







REVIEWS 


Statistics, Volume I. By N. L. Johnson and H. Tetley. xii+294 pp. Cambridge
University Press. 1949. Price 20s.


The practical problems arising from the handling of fairly large samples of data have long occupied the 
attention of the actuary, and the training and development of the numerical sense of the student is a 
problem that has recently received considerable attention. 

The examinations of the Institute of Actuaries were suspended during part of the war period, and at 
the time of the resumption in 1945 consideration was being given to a recasting of the syllabus to secure 
that the examinations kept in step with modern developments and with the widening scope of the
profession. A new syllabus was determined upon and is being introduced by stages, starting from May 1949,
and one of the consequences has been the creation of a need for a complete revision of the course of 
reading. The present volume represents the first part of a two-volume text-book to cover the main 
statistical requirements of Parts I and II of the new syllabus. Although the book has been written 
primarily for actuarial students there is every reason for it to have a considerably wider appeal as 
meeting a requirement not previously filled by existing text-books. 

In writing the book the authors have of necessity had regard to the mathematical standard required 
by the Institute of Actuaries, which is somewhat higher than that of the corresponding subjects in the 
Group I mathematics of the Higher School Certificate Examination of London University. In particular, 
it may be noted that these do not include complex variable analysis, and the book is, therefore, based 
entirely on real variable theory. This is, however, no disadvantage, as emphasis is laid throughout on the 
application of the subject rather than the development of elegant or elaborate mathematical analysis. 

The scope of the book is perhaps best illustrated by a quotation from the part of the examination 
syllabus designed to be covered, namely, Part I, Section B: 


‘ Statistics 

‘The classification and tabulation of data. Frequency distributions and moments. Representative 
measures of location, dispersion, skewness and kurtosis. Standard distributions. Elementary theory of 
sampling, including standard errors (large samples only). 

‘Note. The standard distributions covered in this section of the Syllabus are the binomial, normal and
Poisson distributions; in addition the candidate will be expected to have only a general knowledge of the
Pearsonian distributions.’


The general plan of the book is first to deal with the methods of descriptive statistics, including the 
calculation and interpretation of the usual statistical measures, and then, after a pertinent discussion of 
the questions of statistical inference and theories of probability, to deal with the simpler statistical tests. 
The second volume will deal with the fuller theory of sampling, small samples, graduation and curve 
fitting, etc.

Since the book is designed primarily with appeal to the practical outlook, the first part of the plan 
concentrates on developing the student’s numerical sense, and this is achieved by the use of well-chosen 
diagrams and illustrative examples with a minimum of appeal to mathematical analysis. This occupies 
the first four chapters, representing roughly one-third of the book and takes the student up to linear 
regression and the correlation coefficient. 

Chapter 5 forms a brief introduction to the problem of statistical inference and prepares the student 
for the application of probability theory to practical problems. The fundamentals of the various theories 
of probability are then briefly sketched in Chapter 6, leading up to a frequency theory approach based on 
the experimental approach and theoretical development proposed by Kerrich. 

The authors then turn to the practical applications of the theory, and Chapter 7 is devoted to random 
variables and expected values. This chapter concludes with useful sections on the Tchebychef and Gauss 
inequalities. Following a well-balanced chapter on the binomial, Poisson and normal distributions, the
authors discuss the question of statistical hypotheses and then conclude with the more elementary 
statistical tests, restricting themselves to tests of means, standard deviations and proportions. 

The book is well supplied with examples at the end of each chapter, designed to cover the application 
of the subjects discussed and also to cover important methods or results which could not be included in 
the text. The range of difficulty of the questions is thus fairly wide, but hints on the solutions are provided 
which will aid the student in those cases where mathematical ‘tricks’ are involved. 













The standard of printing is on the usual high level of the Cambridge University Press, but there are
unfortunately some errors in this first edition. Of those in the text the following might mislead a student:


N 
p. 71, the figure — 180 at the foot of the page should read — 1080; p. 190, Example 7:5, & 7; should read 
n 1 
2X 2,; p. 213, equation (8-42) should read a2-? instead of x?. The solutions to the following questions also 
1 


contain errors which might be misleading: 3.3, 3.9, 8.6, 9.2. Finally, there is a misprint in the table on
p. 215, where the value against r = 13, g = .005 should read .0804 instead of .0894, and on p. 216 where the
values against r = 9, 11, g = .01 should read .1201 instead of .1183. These are, however, minor blemishes
in a well-written book that can be corrected in a second edition.

The book is written with a definite object in view, and in this it succeeds very well; the student working 
with the book will maintain a proper sense of proportion as to the position of statistical methods in his 
practical problems. There is, however, no easy road to the development of a proper numerical sense; 
it can only come from the acquaintance with data, and he will still need to supplement his study with 
more examples. The course of study covered by this book should provide the actuarial or other student
with a valuable working acquaintance with modern statistical tools and techniques which he will be able
to apply intelligently to his data, and the book should certainly be read by all concerned in the training
of statisticians.


R. E. BEARD 




